METHODS FOR GENERATING DATABASES AND DATABASES FOR
IDENTIFYING POLYMORPHIC GENETIC MARKERS 

RELATED APPLICATIONS 

Benefit of priority to the following applications is claimed herein: 
U.S. provisional application Serial No. 60/217,658 to Andreas Braun, Hubert
Koster, Dirk Van den Boom, filed July 10, entitled "METHODS FOR
GENERATING DATABASES AND DATABASES FOR IDENTIFYING
POLYMORPHIC GENETIC MARKERS"; U.S. provisional application Serial No.
60/159,176 to Andreas Braun, Hubert Koster, Dirk Van den Boom, filed October
13, 1999, entitled "METHODS FOR GENERATING DATABASES AND
DATABASES FOR IDENTIFYING POLYMORPHIC GENETIC MARKERS"; U.S.
provisional application Serial No. 60/217,251, filed July 10, 2000, to Andreas
Braun, entitled "POLYMORPHIC KINASE ANCHOR PROTEIN GENE SEQUENCES,
POLYMORPHIC KINASE ANCHOR PROTEINS AND METHODS OF DETECTING
POLYMORPHIC KINASE ANCHOR PROTEINS AND NUCLEIC ACIDS ENCODING
THE SAME"; and U.S. application Serial No. 09/663,968, to Ping Yip, filed
September 19, 2000, entitled "METHOD AND DEVICE FOR IDENTIFYING A
BIOLOGICAL SAMPLE."

Where permitted, the above-noted applications and provisional
applications are incorporated by reference in their entirety.

FIELD OF THE INVENTION 

Processes and methods for creating a database of genomic samples from
healthy human donors are provided. Methods that use the database to identify
polymorphic genetic markers and other markers, and to correlate them with
diseases and conditions, are also provided.

BACKGROUND 

Diseases in all organisms have a genetic component, whether inherited or
resulting from the body's response to environmental stresses, such as viruses
and toxins. The ultimate goal of ongoing genomic research is to use this
information to develop new ways to identify, treat and potentially cure these
diseases. The first step has been to screen disease tissue and identify genomic
changes at the level of individual samples. The identification of these "disease"
markers has then fueled the development and commercialization of diagnostic
tests that detect these errant genes or polymorphisms. With the increasing
numbers of genetic markers, including single nucleotide polymorphisms (SNPs),
microsatellites, tandem repeats, newly mapped introns and exons, the challenge
to the medical and pharmaceutical communities is to identify genotypes which
not only identify the disease but also follow the progression of the disease and
are predictive of an organism's response to treatment.

Currently the pharmaceutical and biotechnology industries find a disease
and then attempt to determine the genomic basis for the disease. This approach
is time-consuming and expensive and in many cases involves the investigator
guessing as to what pathways might be involved in the disease.
Genomics 

Presently the two main strategies employed in analyzing the available
genomic information are the technology-driven, reverse-genetics, brute-force
strategy and the knowledge-based, pathway-oriented, forward-genetics strategy.
The brute-force approach yields large databases of sequence information but
little information about the medical or other uses of the sequence information.
Hence this strategy yields intangible products of questionable value. The
knowledge-based strategy yields small databases that contain a lot of
information about medical uses of particular DNA sequences and other products
in the pathway and yields tangible products with a high value.
Polymorphisms 

Polymorphisms have been known since 1901 with the identification of
blood types. In the 1950's they were identified on the level of proteins using
large population genetic studies. In the 1980's and 1990's many of the known
protein polymorphisms were correlated with genetic loci on genomic DNA. For
example, the gene dose of the apolipoprotein E type 4 allele was correlated with
the risk of Alzheimer's disease in late onset families (see, e.g., Corder et al.
(1993) Science 261:921-923); mutation in blood coagulation factor V was
associated with resistance to activated protein C (see, e.g., Bertina et al. (1994)
Nature 369:64-67); resistance to HIV-1 infection has been shown in Caucasian
individuals bearing mutant alleles of the CCR-5 chemokine receptor gene (see,
e.g., Samson et al. (1996) Nature 382:722-725); and a hypermutable tract in
the adenomatous polyposis coli (APC) gene has been identified in
familial colorectal cancer in individuals of Ashkenazi Jewish background (see, e.g.,
Laken et al. (1997) Nature Genet. 17:79-83). There may be more than three
million polymorphic sites in the human genome. Many have been identified, but
not yet characterized or mapped or associated with a marker.
Single nucleotide polymorphisms (SNPs) 

Much of the focus of genomics has been in the identification of SNPs, 
which are important for a variety of reasons. They allow indirect testing 
(association of haplotypes) and direct testing (functional variants). They are the
most abundant and stable genetic markers. Common diseases are best 
explained by common genetic alterations, and the natural variation in the human 
population aids in understanding disease, therapy and environmental 
interactions. 

Currently, the only available method to identify SNPs in DNA is
sequencing, which is expensive, difficult and laborious. Furthermore, once a
SNP is discovered it must be validated to determine if it is a real polymorphism
and not a sequencing error. Also, discovered SNPs must then be evaluated to
determine if they are associated with a particular phenotype. Thus, there is a
need to develop new paradigms for identifying the genomic basis for disease and
markers thereof. Therefore, it is an object herein to provide methods for
identifying the genomic basis of disease and markers thereof.

SUMMARY 

Databases and methods using the databases are provided herein. The
databases comprise sets of parameters associated with subjects in populations
selected only on the basis of being healthy (i.e., where the subjects are
mammals, such as humans, they are selected based upon apparent health and
no detectable infections). The databases can be sorted based upon one or more
of the selected parameters.

The databases are preferably relational databases, in which an index that
represents each subject serves to relate parameters, which are the data, such as
age, ethnicity, sex, medical history, etc. and ultimately genotypic information,
that were input into and stored in the database. The database can then be
sorted according to these parameters. Initially, the parameter information is
obtained from a questionnaire answered by each subject from whom a body
tissue or body fluid sample is obtained. As additional information about each
sample is obtained, this information can be entered into the database and can
serve as a sorting parameter.
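
The relational layout described above can be illustrated with a minimal
sketch. It assumes Python's built-in sqlite3 module; the table names, column
names and sample rows (subjects, genotypes, subject_id, marker_A) are
hypothetical and are not taken from the database described herein. A subject
index relates questionnaire parameters to genotypic information, and the
records can then be sorted by any one parameter.

```python
import sqlite3

def build_demo_db():
    """Create an in-memory database keyed on a subject index (hypothetical schema)."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE subjects (
            subject_id INTEGER PRIMARY KEY,
            age INTEGER, sex TEXT, ethnicity TEXT
        );
        CREATE TABLE genotypes (
            subject_id INTEGER REFERENCES subjects(subject_id),
            marker TEXT, genotype TEXT
        );
    """)
    # Illustrative rows only; real entries would come from the questionnaire.
    con.executemany("INSERT INTO subjects VALUES (?, ?, ?, ?)", [
        (1, 25, "F", "Caucasian"),
        (2, 67, "M", "Hispanic"),
        (3, 43, "F", "Caucasian"),
    ])
    con.executemany("INSERT INTO genotypes VALUES (?, ?, ?)", [
        (1, "marker_A", "G/G"),
        (2, "marker_A", "G/C"),
        (3, "marker_A", "C/C"),
    ])
    return con

def genotypes_sorted_by(con, parameter):
    """Return (parameter value, genotype) rows sorted by one subject parameter."""
    allowed = {"age", "sex", "ethnicity"}
    if parameter not in allowed:
        # Column names cannot be bound as SQL parameters, so whitelist them.
        raise ValueError(f"unknown parameter: {parameter}")
    query = (
        f"SELECT s.{parameter}, g.genotype FROM subjects s "
        f"JOIN genotypes g ON g.subject_id = s.subject_id "
        f"ORDER BY s.{parameter}, s.subject_id"
    )
    return con.execute(query).fetchall()
```

Adding a newly obtained parameter in this sketch would amount to adding a
column or a further table keyed on the same subject index, after which it can
serve as a sorting parameter like any other.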

The databases obtained from healthy individuals have numerous uses,
such as correlating known polymorphisms with a phenotype or disease. The
databases can be used to identify alleles that are deleterious, that are beneficial,
and that are correlated with diseases.

For purposes herein, genotypic information can be obtained by any 
method known to those of skill in the art, but is preferably obtained using mass 
spectrometry. 

Also provided herein is a new use for existing databases of subjects and
genotypic and other parameters, such as age, ethnicity, race, and gender. Any
database can be sorted according to the methods herein and alleles that exhibit
statistically significant correlations with any of the sorting parameters can be
identified. It is noted, however, that the databases provided herein and
randomly selected databases will perform better in these methods, since
disease-based databases suffer numerous limitations, including their relatively
small size, the homogeneity of the selected disease population, and the masking
effect of the polymorphism associated with the markers for which the database
was selected. Hence, the healthy database provided herein provides advantages
not heretofore recognized or exploited. However, the methods provided herein can
be used with a selected database, including disease-based databases, with or
without sorting for the discovery and correlation of polymorphisms. In addition,
the databases provided herein represent a greater genetic diversity than the
unselected databases typically utilized for the discovery of polymorphisms and
thus allow for the enhanced discovery and correlation of polymorphisms.

The databases provided herein can be used for taking an identified
polymorphism and ascertaining whether it changes in frequency when the data
are sorted according to a selected parameter.

One use of these methods is correlating a selected marker with a
particular parameter by following the occurrence of known genetic markers and
then, having made this correlation, determining or identifying correlations with
diseases. Examples of this use are the p53 and lipoprotein lipase polymorphisms.
As exemplified herein, known markers are shown to have particular correlation
with certain groups, such as a particular ethnicity or race or one sex. Such
correlations will then permit development of better diagnostic tests and
treatment regimens.
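
Checking whether the frequency of an identified polymorphism changes when
the data are sorted by a selected parameter can be sketched as follows. This
is only an illustration under stated assumptions: the record layout (a dict
with a two-allele "genotype" tuple plus parameters) and the age_band grouping
are hypothetical, and a real analysis would also test whether any difference
is statistically significant.

```python
from collections import defaultdict

def allele_frequency_by_group(records, allele, group_key):
    """Frequency of `allele` within each group defined by `group_key`.

    Each record is a dict holding a 'genotype' tuple of two alleles
    plus arbitrary parameters (hypothetical record layout).
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for rec in records:
        group = group_key(rec)
        # Count copies of the allele among the chromosomes of this subject.
        hits[group] += sum(1 for a in rec["genotype"] if a == allele)
        totals[group] += len(rec["genotype"])
    return {g: hits[g] / totals[g] for g in totals}

def age_band(rec):
    """Hypothetical sorting parameter: two coarse age bands."""
    return "under_40" if rec["age"] < 40 else "40_and_over"
```

Any other parameter, such as sex or ethnicity, can be substituted by passing
a different grouping function.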

These methods are valuable for identifying one or more genetic markers
whose frequency changes within the population as a function of age, ethnic
group, sex or some other criterion. This can allow the identification of previously
unknown polymorphisms and ultimately a gene or pathway involved in the onset
and progression of disease.

The databases and methods provided herein permit, among other things,
identification of components, particularly key components, of a disease process
by understanding its genetic underpinnings and also permit an understanding of
processes, such as individual drug responses. The databases and methods
provided herein also can be used in methods involving elucidation of pathological
pathways, in developing new diagnostic assays, identifying new potential drug
targets, and in identifying new drug candidates.

The methods and databases can be used with experimental procedures,
including, but not limited to, in silico SNP identification, in vitro SNP
identification/verification, genetic profiling of large populations, and
biostatistical analyses and interpretations.

Also provided herein are combinations that contain a database provided
herein and a biological sample from a subject in the database, and preferably
biological samples from all subjects or a plurality of subjects in the database.
Collections of the tissue and body fluid samples are also provided.

Also provided herein are methods for determining a genetic marker that
correlates with age, comprising identifying a polymorphism and determining the
frequency of the polymorphism with increasing age in a healthy population.

Further provided herein are methods for determining whether a genetic
marker correlates with susceptibility to morbidity, early mortality, or morbidity
and early mortality, comprising identifying a polymorphism and determining the
frequency of the polymorphism with increasing age in a healthy population.

Any of the methods herein described can be carried out in a multiplex
format.

Also provided are an apparatus and process for accurately identifying
genetic information. It is another object herein that genetic information be
extracted from genetic data in a highly automated manner. Therefore, to
overcome the deficiencies in the known conventional systems, a method and
apparatus for identifying a biological sample is proposed.

Briefly, the method and system for identifying a biological sample
generates a data set indicative of the composition of the biological sample. In a
particular example, the data set is DNA spectrometry data received from a mass
spectrometer. The data set is denoised, and a baseline is deleted. Since
possible compositions of the biological sample may be known, expected peak
areas may be determined. Using the expected peak areas, a residual baseline is
generated to further correct the data set. Probable peaks are then identifiable in
the corrected data set, which are used to identify the composition of the
biological sample. In a disclosed example, statistical methods are employed to
determine the probability that a probable peak is an actual peak, not an actual
peak, or that the data are too inconclusive to call.
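
The flow from baseline removal to a three-way call can be sketched in a
deliberately simplified form. This is not the procedure of the disclosed
example: a running-minimum baseline and fixed signal-to-noise thresholds are
swapped in here as stand-ins for the denoising, residual baseline and
statistical steps described above, and the function names, window size and
threshold values are all hypothetical.

```python
def remove_baseline(signal, window=5):
    """Subtract a running-minimum baseline (a simple stand-in technique).

    The window should be wider than the widest expected peak; otherwise
    the peak itself is partly subtracted.
    """
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        out.append(signal[i] - min(signal[lo:hi]))
    return out

def call_peak(signal, expected_index, noise_sd, present=5.0, absent=2.0):
    """Classify an expected peak position as 'peak', 'no peak' or 'no call'.

    The decision uses the signal-to-noise ratio at the expected position;
    the two thresholds are hypothetical, not values from the source.
    """
    if noise_sd <= 0:
        raise ValueError("noise_sd must be positive")
    snr = signal[expected_index] / noise_sd
    if snr >= present:
        return "peak"
    if snr <= absent:
        return "no peak"
    return "no call"  # data too inconclusive to call
```

Keeping an explicit "no call" outcome, rather than forcing every position
into peak or no peak, mirrors the three-way determination described above.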

Advantageously, the method and system for identifying a biological
sample accurately makes composition calls in a highly automated manner. In
such a manner, complete SNP profile information, for example, may be collected
efficiently. More importantly, the collected data is analyzed with highly accurate
results. For example, when a particular composition is called, the result may be
relied upon with great confidence. Such confidence is provided by the robust
computational process employed.

DESCRIPTION OF THE DRAWINGS

Figure 1 depicts an exemplary sample bank. Panel 1 shows the samples
as a function of sex and ethnicity. Panel 2 shows the Caucasians as a function
of age. Panel 3 shows the Hispanics as a function of age.

Figures 2A and 2C show an age- and sex-distribution of the 291S allele
of the lipoprotein lipase gene in which a total of 436 males and 589 females
were investigated. Figure 2B shows an age distribution for the 436 males.

Figure 3 is an exemplary questionnaire for population-based sample 
banking. 

Figure 4 depicts processing and tracking of blood sample components.

Figure 5 depicts the allelic frequency of "sick" alleles and "healthy"
alleles as a function of age. It is noted that the relative frequency of healthy
alleles increases in a population with increasing age.

Figure 6 depicts the age-dependent distribution of ApoE genotypes (see
Schachter et al. (1994) Nature Genetics 6:29-32).

Figure 7A-D depicts the age-related allele and genotype frequencies of
p53 (tumor suppressor) codon 72 among the Caucasian population in the database.
*R72 and *P72 represent the frequency of the allele in the database population.
R72, R72P, and P72 represent the genotypes of the individuals in the population.
The frequency of the homozygous P72 genotype drops from 6.7% to 3.7% with
age.

Figure 8 depicts the allele and genotype frequencies of the p21 S31R 
allele as a function of age. 

Figure 9 depicts the frequency of the FVII Allele 353Q in pooled versus 
individual samples. 

Figure 10 depicts the frequency of the CETP (cholesterol ester transfer
protein) allele in pooled versus individual samples.

Figure 11 depicts the frequency of the plasminogen activator inhibitor-1
(PAI-1) 5G in pooled versus individual samples.

Figure 12 shows mass spectra of the samples and the ethnic diversity of
the PAI-1 alleles.

Figure 13 shows mass spectra of the samples and the ethnic diversity of
the CETP 405 alleles.

Figure 14 shows mass spectra of the samples and the ethnic diversity of
the Factor VII 353 alleles.

Figure 15 shows ethnic diversity of PAI-1, CETP and Factor VII using the
pooled DNA samples.

Figure 16 shows the p53-Rb pathway and the relationships among the
various factors in the pathway.

Figure 17, which is a block diagram of a computer constructed to provide
and process the databases described herein, depicts a typical computer system
for storing and sorting the databases provided herein and practicing the methods
provided herein.

Figure 18 is a flow diagram that illustrates the processing steps
performed using the computer illustrated in Figure 17, to maintain and provide
access to the databases for identifying polymorphic genetic markers.

Figure 19 is a histogram showing the allele and genotype distribution in
the age and sex stratified Caucasian population for the AKAP10-1 locus. Bright
green bars show frequencies in individuals younger than 40 years. Dark green
bars show frequencies in individuals older than 60 years.

Figure 20 is a histogram showing the allele and genotype distribution in
the age and sex stratified Caucasian population for the AKAP10-5 locus. Bright
green bars show frequencies in individuals younger than 40 years; dark green
bars show frequencies in individuals older than 60 years.

Figure 21 is a histogram showing the allele and genotype distribution in
the age and sex stratified Caucasian population for the h-msrA locus. Genotype
difference between male age groups is significant. Bright green bars show
frequencies in individuals younger than 40 years. Dark green bars show
frequencies in individuals older than 60 years.

Figure 22A-D is a sample data collection questionnaire used for the 
healthy database. 

Figure 23 is a flowchart showing processing performed by the computing
device of Figure 24 when performing genotyping of sense strands and antisense
strands from assay fragments.

Figure 24 is a block diagram showing a system in accordance with the
present invention;

Figure 25 is a flowchart of a method of identifying a biological sample in
accordance with the present invention;

Figure 26 is a graphical representation of data from a mass spectrometer;

Figure 27 is a diagram of wavelet transformation of mass spectrometry
data;

Figure 28 is a graphical representation of wavelet stage 0 hi data;
Figure 29 is a graphical representation of stage 0 noise profile;
Figure 30 is a graphical representation of generating stage noise standard
deviations;

Figure 31 is a graphical representation of applying a threshold to data
stages;

Figure 32 is a graphical representation of a sparse data set;
Figure 33 is a formula for signal shifting;

Figure 34 is a graphical representation of a wavelet transformation of a
denoised and shifted signal;

Figure 35 is a graphical representation of a denoised and shifted signal;
Figure 36 is a graphical representation of removing peak sections;
Figure 37 is a graphical representation of generating a peak free signal;

Figure 38 is a block diagram of a method of generating a baseline
correction;

Figure 39 is a graphical representation of a baseline and signals;
Figure 40 is a graphical representation of a signal with baseline removed;
Figure 41 is a table showing compressed data;

Figure 42 is a flowchart of a method for compressing data;
Figure 43 is a graphical representation of mass shifting;
Figure 44 is a graphical representation of determining peak width;
Figure 45 is a graphical representation of removing peaks;
Figure 46 is a graphical representation of a signal with peaks removed;

Figure 47 is a graphical representation of a residual baseline;

Figure 48 is a graphical representation of a signal with residual baseline
removed;

Figure 49 is a graphical representation of determining peak height;
Figure 50 is a graphical representation of determining signal-to-noise for
each peak;

Figure 51 is a graphical representation of determining a residual error for
each peak;

Figure 52 is a graphical representation of peak probabilities;
Figure 53 is a graphical representation of applying an allelic ratio to peak
probability;

Figure 54 is a graphical representation of determining peak probability;
Figure 55 is a graphical representation of calling a genotype;
Figure 56 is a flowchart showing a statistical procedure for calling a
genotype;

Figure 57 is a flowchart showing processing performed by the computing
device of Figure 1 when performing standardless genotyping; and

Figure 58 is a graphical representation of applying an allelic ratio to peak
probability for standardless genotype processing.

DETAILED DESCRIPTION 

Definitions

Unless defined otherwise, all technical and scientific terms used herein
have the same meaning as is commonly understood by one of ordinary skill in
the art to which this invention belongs. All patents, applications, published
applications and other publications and sequences from GenBank and other
databases referred to herein throughout the disclosure are incorporated by
reference in their entirety.

As used herein, a biopolymer includes, but is not limited to, nucleic acids,
proteins, polysaccharides, lipids and other macromolecules. Nucleic acids
include DNA, RNA, and fragments thereof. Nucleic acids may be derived from
genomic DNA, RNA, mitochondrial nucleic acid, chloroplast nucleic acid and
other organelles with separate genetic material.

As used herein, morbidity refers to conditions, such as diseases or
disorders, that compromise the health and well-being of an organism, such as an
animal. Morbidity susceptibility or morbidity-associated genes are genes that,
when altered, for example, by a variation in nucleotide sequence, facilitate the
expression of a specific disease clinical phenotype. Thus, morbidity
susceptibility genes have the potential, upon alteration, of increasing the
likelihood or general risk that an organism will develop a specific disease.

As used herein, mortality refers to the statistical likelihood that an
organism, particularly an animal, will not survive a full predicted lifespan.
Hence, a trait or a marker, such as a polymorphism, associated with increased
mortality is observed at a lower frequency in older than younger segments of a
population.
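
Whether a marker is observed at a lower frequency in older than in younger
segments can be examined with a standard test on allele counts. The sketch
below computes a Pearson chi-square statistic for a 2x2 table (allele present
or absent, younger or older group). The choice of this test, the function
name and the counts in the usage note are assumptions made for illustration
and are not taken from the source.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic and p-value (1 degree of freedom)
    for the 2x2 table [[a, b], [c, d]], without continuity correction.

    Rows: younger group, older group.
    Columns: chromosomes carrying the allele, chromosomes not carrying it.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("each row and column needs at least one count")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value
```

For example, with hypothetical counts of 60 of 200 chromosomes carrying the
allele in the younger group and 30 of 200 in the older group,
chi_square_2x2(60, 140, 30, 170) gives a statistic of about 12.9. A small
p-value indicates that the frequencies differ between the age groups; it
does not by itself establish that the marker causes early mortality.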

As used herein, a polymorphism, e.g., genetic variation, refers to a
variation in the sequence of a gene in the genome amongst a population, such as
allelic variations and other variations that arise or are observed. Thus, a
polymorphism refers to the occurrence of two or more genetically determined
alternative sequences or alleles in a population. These differences can occur in
coding and non-coding portions of the genome, and can be manifested or
detected as differences in nucleic acid sequences, gene expression, including,
for example, transcription, processing, translation, transport, protein processing,
trafficking, DNA synthesis, expressed proteins, other gene products or products
of biochemical pathways or in post-translational modifications and any other
differences manifested amongst members of a population. A single nucleotide
polymorphism (SNP) refers to a polymorphism that arises as the result of a single
base change, such as an insertion, deletion or change in a base.

A polymorphic marker or site is the locus at which divergence occurs.
Such a site may be as small as one base pair (an SNP). Polymorphic markers
include, but are not limited to, restriction fragment length polymorphisms,
variable number of tandem repeats (VNTR's), hypervariable regions,
minisatellites, dinucleotide repeats, trinucleotide repeats, tetranucleotide repeats
and other repeating patterns, simple sequence repeats and insertional elements,
such as Alu. Polymorphic forms also are manifested as different Mendelian
alleles for a gene. Polymorphisms may be observed by differences in proteins,
protein modifications, RNA expression modification, DNA and RNA methylation,
regulatory factors that alter gene expression and DNA replication, and any other
manifestation of alterations in genomic nucleic acid or organelle nucleic acids.

As used herein, a healthy population refers to a population of organisms,
including, but not limited to, animals, bacteria, viruses, parasites, plants,
eubacteria, and others, that are disease free. The concept of disease-free is a
function of the selected organism. For example, for mammals it refers to a
subject not manifesting any disease state. Practically, a healthy subject, when
human, is defined as a human donor who passes blood bank criteria to donate
blood for eventual use in the general population. These criteria are as follows:
free of detectable viral, bacterial, mycoplasma, and parasitic infections; not
anemic; and then further selected based upon a questionnaire regarding history
(see Figure 3). Thus, a healthy population represents an unbiased population of
sufficient health to donate blood according to blood bank criteria, and not further
selected for any disease state. Typically such individuals are not taking any
medications. For plants, for example, it is a plant population that does not
manifest disease pathology associated with plants. For bacteria it is a bacterial
population replicating without environmental stress, such as selective agents,
heat and other pathogens.

As used herein, a healthy database (or healthy patient database) refers to
a database of profiles of subjects that have not been pre-selected for any
particular disease. Hence, the subjects that serve as the source of data for the
database are selected, according to predetermined criteria, to be healthy. In
contrast to other such databases that have been pre-selected for subjects with a
particular disease or other characteristic, the subjects for the database provided
herein are not so selected. Also, if the subjects do manifest a disease or other
condition, any polymorphism discovered or characterized should be related to an
independent disease or condition. In a preferred embodiment, where the
subjects are human, a healthy subject manifests no disease symptoms and
meets criteria, such as those set by blood banks for blood donors.

Thus, the subjects for the database are a population of any organism,
including, but not limited to, animals, plants, bacteria, viruses, parasites and
any other organism or entity that has nucleic acid. Among preferred subjects are
mammals, preferably, although not necessarily, humans. Such a database can
capture the diversity of a population, thus providing for discovery of rare
polymorphisms.

As used herein, a profile refers to information relating to, but not limited
to and not necessarily including all of, age, sex, ethnicity, disease history, family
history, phenotypic characteristics, such as height and weight, and other relevant
parameters. A sample information collection form is shown in Figure 22, which
illustrates the intent of a profile.

As used herein, a disease state is a condition or abnormality or disorder 
that may be inherited or result from environmental stresses, such as toxins, 
bacterial, fungal and viral infections. 

As used herein, a set of non-selected subjects means that the subjects
have not been pre-selected to share a common disease or other characteristic.
They can be selected to be healthy as defined herein.

As used herein, a phenotype refers to a set of parameters that includes
any distinguishable trait of an organism. A phenotype can be a physical trait and
can be, in instances in which the subject is an animal, a mental trait, such as
an emotional trait. Some phenotypes can be determined by observation elicited by
questionnaires (see, e.g., Figures 3 and 22) or by referring to prior medical and
other records. For purposes herein, a phenotype is a parameter around which
the database can be sorted.

As used herein, a parameter is any input data that will serve as a basis
for sorting the database. These parameters will include phenotypic traits,
medical histories, family histories and any other such information elicited from a
subject or observed about the subject. A parameter may describe the subject,
some historical or current environmental or social influence experienced by the
subject, or a condition or environmental influence on someone related to the
subject. Parameters include, but are not limited to, any of those described
herein, and known to those of skill in the art.

As used herein, haplotype refers to two or more polymorphisms located
on a single DNA strand. Hence, haplotyping refers to identification of two or
more polymorphisms on a single DNA strand. Haplotypes can be indicative of a
phenotype. For some disorders a single polymorphism may suffice to indicate a
trait; for others a plurality (i.e., a haplotype) may be needed. Haplotyping can be
performed by isolating nucleic acid and separating the strands. In addition,
when using enzymes, such as certain nucleases, that produce different size
fragments from each strand, strand separation is not needed for haplotyping.

As used herein, pattern, with reference to a mass spectrum
or mass spectrometry analyses, refers to a characteristic distribution and
number of signals (such as peaks or digital representations thereof).

As used herein, signal in the context of a mass spectrum and analysis
thereof refers to the output data, which is the number or relative number of
molecules having a particular mass. Signals include "peaks" and digital
representations thereof.

As used herein, adaptor, when used with reference to haplotyping using
Fen ligase, refers to a nucleic acid that specifically hybridizes to a polymorphism
of interest. An adaptor can be partially double-stranded. An adaptor complex
is formed when an adaptor hybridizes to its target.

As used herein, a target nucleic acid refers to any nucleic acid of interest
in a sample. It can contain one or more nucleotides.

As used herein, standardless analysis refers to a determination based 
upon an internal standard. For example, the frequency of a polymorphism can be 
determined herein by comparing signals within a single mass spectrum. 
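
Comparing signals within a single mass spectrum can be sketched as a ratio
of peak areas. The sketch assumes that each allele gives one allele-specific
peak and that the products are detected with comparable efficiency, so that
peak area is proportional to abundance; the function name and the example
areas are hypothetical.

```python
def allele_frequencies_from_peaks(peak_areas):
    """Relative allele frequencies from peak areas within one spectrum.

    `peak_areas` maps an allele label to the area of its peak. Each area
    is normalized to the summed area of all allele peaks in the spectrum,
    so no external standard is needed.
    """
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {allele: area / total for allele, area in peak_areas.items()}
```

For a pooled sample with hypothetical areas of 300.0 for allele C and 100.0
for allele T, the estimated frequencies are 0.75 and 0.25.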

As used herein, amplifying refers to means for increasing the amount of a
biopolymer, especially nucleic acids. Based on the 5' and 3' primers that are
chosen, amplification also serves to restrict and define the region of the genome
which is subject to analysis. Amplification can be by any means known to those
skilled in the art, including use of the polymerase chain reaction (PCR).
Amplification, e.g., by PCR, must be done quantitatively when the frequency of
a polymorphism is to be determined.

As used herein, cleaving refers to non-specific and specific fragmentation
of a biopolymer.

As used herein, multiplexing refers to the simultaneous detection of more
than one polymorphism. Methods for performing multiplexed reactions,
particularly in conjunction with mass spectrometry, are known (see, e.g., U.S.
Patent Nos. 6,043,031, 5,547,835 and International PCT application No.
WO 97/37041).

As used herein, reference to mass spectrometry encompasses any suitable
mass spectrometric format known to those of skill in the art. Such formats
include, but are not limited to, Matrix-Assisted Laser Desorption/Ionization,
Time-of-Flight (MALDI-TOF), Electrospray (ES), IR-MALDI (see, e.g., published
International PCT application No. 99/57318 and U.S. Patent No. 5,118,937), Ion
Cyclotron Resonance (ICR), Fourier Transform and combinations thereof.
MALDI, particularly UV and IR, are among the preferred formats.

As used herein, mass spectrum refers to the presentation of data
obtained from analyzing a biopolymer or fragment thereof by mass spectrometry
either graphically or encoded numerically.

As used herein, a blood component is a component that is separated from 
blood and includes, but is not limited to, red blood cells and platelets, blood 
clotting factors, plasma, enzymes, plasminogen and immunoglobulins. A cellular 
blood component is a component of blood, such as a red blood cell, that is a 
cell. A blood protein is a protein that is normally found in blood. Examples of 
such proteins are blood factors VII and VIII. Such proteins and components are 
well-known to those of skill in the art. 

As used herein, plasma can be prepared by any method known to those 
of skill in the art. For example, it can be prepared by centrifuging blood at a 
force that pellets the red cells and forms an interface between the red cells and 
the buffy coat, which contains leukocytes, above which is the plasma. For 
example, typical platelet concentrates contain at least about 10% plasma. 

Blood may be separated into its components, including, but not limited to, 
plasma, platelets and red blood cells, by any method known to those of skill in 
the art. For example, blood can be centrifuged for a sufficient time and at a 
sufficient acceleration to form a pellet containing the red blood cells. Leukocytes 
collect primarily at the interface of the pellet and supernatant in the buffy coat 
region. The supernatant, which contains plasma, platelets, and other blood 
components, may then be removed and centrifuged at a higher acceleration, 
whereby the platelets pellet. 

As used herein, p53 is a cell cycle control protein that assesses DNA 
damage and acts as a transcription factor regulating genes that control cell 
growth, DNA repair and apoptosis. p53 mutations have been found in a 
wide variety of different cancers, including all of the different types of leukemia, 
with varying frequency. The loss of normal p53 function results in genomic 
instability and uncontrolled growth of the host cell. 

As used herein, p21 is a cyclin-dependent kinase inhibitor, associated 
with G1 phase arrest of normal cells. Expression triggers apoptosis or 
programmed cell death and has been associated with Wilms' tumor, a pediatric 
kidney cancer. 

As used herein, Factor VII is a serine protease involved in the extrinsic blood 
coagulation cascade. This factor is activated by thrombin and works with tissue 
factor (Factor III) in the processing of Factor X to Factor Xa. Evidence has 
supported an association between polymorphisms in the gene and increased 
Factor VII activity, which can result in an elevated risk of ischemic cardiovascular 
disease, including myocardial infarction. 

As used herein, a relational database stores information in a form 
representative of matrices, such as two-dimensional tables, including rows and 
columns of data, or higher dimensional matrices. For example, in one 
embodiment, the relational database has separate tables each with a parameter. 
The tables are linked with a record number, which also acts as an index. The 
database can be searched or sorted by using data in the tables and is stored in 
any suitable storage medium, such as a floppy disk, CD-ROM disk, hard drive or 
other suitable medium. 

As used herein, a bar code refers to any array of optically readable marks 
of any desired size and shape that are arranged in a reference context or frame 
of, preferably, although not necessarily, one or more columns and one or more 
rows. For purposes herein, the bar code refers to any symbology, not necessarily 
"bar", and may include dots, characters or any symbol or symbols. 

As used herein, symbology refers to an identifier code or symbol, such as 
a bar code, that is linked to a sample. The index will reference each such 
symbology. The symbology is any code known or designed by the user. The 
symbols are associated with information stored in the database. For example, 
each sample can be uniquely identified with an encoded symbology. The 
parameters, such as the answers to the questions, and subsequent genotypic and 
other information obtained upon analysis of the samples are included in the 
database and associated with the symbology. The database is stored on any 
suitable recording medium, such as a hard drive, a floppy disk, a tape, a CD-ROM, 
a DVD disk or any other suitable medium. 
DATABASES 

Human genotyping is currently dependent on collaborations with 
hospitals, tissue banks and research institutions that provide samples of disease 
tissue. This approach is based on the concept that the onset and/or progression 
of diseases can be correlated with the presence of a polymorphism or other 
genetic marker. This approach does not consider that disease can be correlated 
with both the presence of specific markers and the absence of specific markers. 
It is shown herein that identification and scoring of the appearance and 
disappearance of markers is possible only if these markers are measured in the 
background of healthy subjects where the onset of disease does not mask the 
change in polymorphism occurrence. Databases of information from disease 
populations suffer from small sample size, selection bias and heterogeneity. The 
databases provided herein from healthy populations solve these problems by 
permitting large sample bands, simple selection methods and diluted 
heterogeneity. 

Provided herein are first databases of parameters associated with 
non-selected, particularly healthy, subjects. Also provided are combinations of the 
databases with indexed samples obtained from each of the subjects. Further 
provided are databases produced from the first databases. These contain, in 
addition to the original parameters, information, such as genotypic information, 
including, but not limited to, genomic sequence information, derived from the 
samples. 

WO 01/27857 

PCT/US00/28413 

The databases, which are herein designated healthy databases, are 
so-designated because they are not obtained from subjects pre-selected for a 
particular disease. Hence, although individual members may have a disease, the 
collection of individuals is not selected to have a particular disease. 

The subjects from whom the parameters are obtained comprise either a 
set of subjects who are randomly selected across, preferably, all populations, or 
are pre-selected to be disease-free or healthy. As a result, the database is not 
selected to be representative of any pre-selected phenotype, genotype, disease 
or other characteristic. Typically the number of subjects from which the 
database is prepared is selected to produce statistically significant results when 
used in the methods provided herein. Preferably, the number of subjects will be 
greater than 100, more preferably greater than 200, yet more preferably greater 
than 1000. The precise number can be empirically determined based upon the 
frequency of the parameter(s) that will be used to sort the database. Generally the 
population can have at least 50, at least 100, at least 200, at least 500, at least 
1000, at least 5000 or at least 10,000 or more subjects. 
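
For illustration only, the dependence of the number of subjects on the frequency of a sorting parameter can be sketched with the normal approximation to the binomial distribution. The function name, the 95% confidence multiplier and the example margins below are assumptions chosen for the sketch; they are not values taken from the methods provided herein.

```python
import math


def observations_needed(freq, margin, z=1.96):
    """Approximate number of independent observations needed so that an
    observed frequency near `freq` is estimated to within +/- `margin`.

    Uses the normal approximation to the binomial; z=1.96 corresponds to
    roughly 95% confidence. All parameters are illustrative assumptions.
    """
    if not 0.0 < freq < 1.0 or margin <= 0.0:
        raise ValueError("freq must be in (0, 1) and margin must be positive")
    return math.ceil(z * z * freq * (1.0 - freq) / (margin * margin))


# A common parameter (50%) estimated to within 5 percentage points,
# and a rarer one (10%) estimated to within 2 percentage points.
print(observations_needed(0.5, 0.05))
print(observations_needed(0.1, 0.02))
```

When the observation is an allele rather than a subject, each diploid subject contributes two chromosomes, so the number of subjects may be about half the number of observations under this simplified model.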

Upon identification of a collection of subjects, information about each 
subject is recorded and associated with each subject as a database. The 
information associated with each of the subjects includes, but is not limited to, 
information related to historical characteristics of the subjects, phenotypic 
characteristics and also genotypic characteristics, medical characteristics and 
any other traits and characteristics about the subject that can be determined. 
This information will serve as the basis for sorting the database. 

In an exemplary embodiment, the subjects are mammals, such as 
humans, and the information relates to one or more parameters, such as age, 
sex, medical history, ethnicity and any other factor. Such information, when the 
animals are humans, for example, can be obtained by a questionnaire, and by 
observations about the individual, such as hair color, eye color and other 
characteristics. Genotypic information will be obtained from tissue or other body 
and body fluid samples from the subject. 

The healthy genomic database can include profiles and polymorphisms 
from healthy individuals from a library of blood samples where each sample in 
the library is an individual and separate blood or other tissue sample. Each 
sample in the database is profiled as to the sex, age, ethnic group, and disease 
history of the donor. 

The databases are generated by first identifying healthy populations of 
subjects and obtaining information about each subject that will serve as the 
sorting parameters for the database. This information is preferably entered into 
a storage medium, such as the memory of a computer. 

The information obtained about each subject in a population used for 
generating the database is stored in a computer memory or other suitable 
storage medium. The information is linked to an identifier associated with each 
subject. Hence the database will identify a subject, for example by a datapoint 
representative of a bar code, and then all information, such as the information 
from a questionnaire, regarding the individual is associated with the datapoint. 
As the information is collected the database is generated. 

Thus, for example, profile information, such as subject histories obtained 
from questionnaires, is collected in the database. The resulting database can be 
sorted as desired, using standard software, such as by age, sex and/or ethnicity. 
An exemplary questionnaire for subjects from whom samples are to be obtained 
is shown in Figures 22A-D. Each questionnaire preferably is identified by a bar 
code, particularly a machine readable bar code for entry into the database. After 
a subject provides data and is deemed to be healthy (i.e., meets standards for 
blood donation), the data in the questionnaire is entered into the database and is 
associated with the bar code. A tissue, cell or blood sample is obtained from the 
subject. 

Figure 4 exemplifies processing and tracking of blood sample 
components. Each component is tracked with a bar code, dated, entered into 
the database and associated with the subject and the profile of the subject. 
Typically, the whole blood is centrifuged to produce plasma, red blood cells 
(which pellet) and leukocytes found in the buffy coat, which layers in between. 
Various samples are obtained and coded with a bar code and stored for use as 
needed. 

Samples are collected from the subjects. The samples include, but are 
not limited to, tissues, cells, and fluids, such as nucleic acid, blood, plasma, 
amniotic fluid, synovial fluid, urine, saliva, aqueous humor, sweat, sperm 
samples and cerebral spinal fluid. It is understood that the particular set of 
samples depends upon the organisms in the population. 

Once samples are obtained the collection can be stored and, in preferred 
embodiments, each sample is indexed with an identifier, particularly a machine 
readable code, such as a bar code. For analyses, the samples or components of 
the samples, particularly biopolymers and small molecules, such as nucleic acids 
and/or proteins and metabolites, are isolated. 

After samples are analyzed, this information is entered into the database 
in the memory of the storage medium and associated with each subject. This 
information includes, but is not limited to, genotypic information, particularly 
nucleic acid sequence information and other information indicative of 
polymorphisms, such as masses of PCR fragments, peptide fragment sequences 
or masses, spectra of biopolymers and small molecules and other indicia of the 
structure or function of a gene, gene product or other marker from which the 
existence of a polymorphism within the population can be inferred. 

In an exemplary embodiment, a database can be derived from a collection 
of blood samples. For example, Figure 1 (see, also Figure 10) shows the status 
of a collection of over 5000 individual samples. The samples were processed in 
the laboratory following SOP (standard operating procedure) guidelines. Any 
standard blood processing protocol may be used. 

For the exemplary database described herein, the following criteria were 
used to select subjects: 

    No testing is done for infectious agents. 
    Age: At least 17 years old 
    Weight: Minimum of 110 pounds 
    Permanently Disqualified: 
        History of hepatitis (after age 11) 
        Leukemia, Lymphoma 
        Human immunodeficiency virus (HIV), AIDS 
        Chronic kidney disease 
    Temporarily Disqualified: 
        Pregnancy - until six weeks after delivery, miscarriage or abortion 
        Major surgery or transfusions - for one year 
        Mononucleosis - until complete recovery 
        Prior whole blood donation - for eight weeks 
        Antibiotics by injection - for one week; by mouth - for forty-eight 
        hours, except antibiotics for skin complexion 
    5 year Deferment: 
        Internal cancer and skin cancer if it has been removed, is healed and 
        there is no recurrence 

These correspond to blood bank criteria for donating blood and represent a 
healthy population as defined herein for a human healthy database. 
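
By way of a non-limiting sketch, the selection criteria listed above could be applied to questionnaire data as follows. The class, field and event names are hypothetical and chosen only for illustration. The sketch does not model current pregnancy, recovery from mononucleosis or the exception for skin-complexion antibiotics, and it treats the five-year deferment as a waiting period.

```python
from dataclasses import dataclass, field

# Conditions listed above as permanent disqualifications (labels are hypothetical).
PERMANENT = {
    "hepatitis_after_age_11", "leukemia", "lymphoma",
    "hiv_aids", "chronic_kidney_disease",
}

# Minimum waiting period, in days, after each deferring event.
DEFERRAL_DAYS = {
    "delivery_miscarriage_or_abortion": 42,   # six weeks
    "major_surgery_or_transfusion": 365,      # one year
    "whole_blood_donation": 56,               # eight weeks
    "injected_antibiotics": 7,                # one week
    "oral_antibiotics": 2,                    # forty-eight hours
    "cancer_removed_and_healed": 5 * 365,     # five years
}


@dataclass
class Donor:
    age_years: int
    weight_lb: float
    history: set = field(default_factory=set)        # permanent conditions reported
    days_since: dict = field(default_factory=dict)   # event name -> days elapsed


def is_eligible(donor):
    """Return True if the donor meets the age, weight and history criteria."""
    if donor.age_years < 17 or donor.weight_lb < 110:
        return False
    if donor.history & PERMANENT:
        return False
    # An event that was never reported does not defer the donor.
    return all(donor.days_since.get(event, wait) >= wait
               for event, wait in DEFERRAL_DAYS.items())
```
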
Structure of the database 

Any suitable database structure and format known to those of skill in the 
art may be employed. For example, a relational database is a preferred format in 
which data is stored as matrices or tables of the parameters linked by an indexer 
that identifies each subject. Software for preparing and manipulating, including 
sorting the database, can be readily developed or adapted from commercially 
available software, such as Microsoft Access. 
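
As one possible illustration of such a structure, the sketch below uses the SQLite engine from the Python standard library in place of the commercial software named above. The table names, column names, bar codes and marker label are hypothetical; the point shown is only that separate tables are linked by an index identifying each subject and can then be sorted by a parameter.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two tables linked by bar code."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE subject (
            barcode   TEXT PRIMARY KEY,   -- index identifying each subject
            age       INTEGER,
            sex       TEXT,
            ethnicity TEXT
        );
        CREATE TABLE genotype (
            barcode TEXT REFERENCES subject(barcode),
            marker  TEXT,
            allele  TEXT
        );
    """)
    con.executemany("INSERT INTO subject VALUES (?, ?, ?, ?)", [
        ("BC001", 25, "F", "group_A"),
        ("BC002", 61, "M", "group_B"),
        ("BC003", 43, "F", "group_A"),
    ])
    con.executemany("INSERT INTO genotype VALUES (?, ?, ?)", [
        ("BC001", "marker_1", "Q"),
        ("BC002", "marker_1", "R"),
        ("BC003", "marker_1", "R"),
    ])
    return con


def genotypes_sorted_by_age(con, marker):
    """Join the two tables on the bar code and sort the result by age."""
    return con.execute(
        "SELECT s.barcode, s.age, g.allele "
        "FROM subject AS s JOIN genotype AS g ON g.barcode = s.barcode "
        "WHERE g.marker = ? ORDER BY s.age, s.barcode",
        (marker,),
    ).fetchall()
```
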
Quality control 

Quality control procedures can be implemented. For example, after 
collection of samples, the quality of the collection in the bank can be assessed. 
For example, mix-up of samples can be checked by testing for known markers, 
such as sex. After samples are separated by ethnicity, samples are randomly 
tested for a marker associated with a particular ethnicity, such as HLA DQA1 
group specific component, to assess whether the samples have been properly 
sorted by ethnic group. An exemplary sample bank is depicted in Figure 4. 
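
The mix-up check described above can be sketched as a comparison between the sex recorded in the questionnaire and the sex inferred from a marker. The function name and the dictionary layout are assumptions made for this sketch; how the marker result is obtained is outside its scope.

```python
def find_sex_mismatches(recorded, genotyped):
    """Return the sorted bar codes whose recorded sex disagrees with the
    sex inferred from a known marker. Samples without a marker result
    are skipped rather than flagged."""
    return sorted(
        barcode for barcode, sex in recorded.items()
        if barcode in genotyped and genotyped[barcode] != sex
    )


# Hypothetical example: BC002 is recorded as male but types as female,
# which would prompt a check for a possible sample mix-up.
recorded = {"BC001": "F", "BC002": "M", "BC003": "F"}
genotyped = {"BC001": "F", "BC002": "F"}
print(find_sex_mismatches(recorded, genotyped))
```
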
Obtaining genotypic data and other parameters for the database 

After informational and historical parameters are entered into the 
database, material from samples obtained from each subject is analyzed. 
Analyzed materials include proteins, metabolites, nucleic acids, lipids and any 
other desired constituent of the material. For example, nucleic acids, such as 
genomic DNA, can be analyzed by sequencing. 

Sequencing can be performed using any method known to those of skill in 
the art. For example, if a polymorphism is identified or known, and it is desired 
to assess its frequency or presence among the subjects in the database, the 
region of interest from each sample can be isolated, such as by PCR or 
restriction fragments, hybridization or other suitable method known to those of 
skill in the art, and sequenced. For purposes herein, sequencing analysis is 
preferably effected using mass spectrometry (see, e.g., U.S. Patent Nos. 
5,547,835, 5,622,824, 5,851,765, and 5,928,906). Nucleic acids can also be 
sequenced by hybridization (see, e.g., U.S. Patent Nos. 5,503,980, 5,631,134 and 
5,795,714), including with analysis by mass spectrometry (see, U.S. application 
Serial Nos. 08/419,994 and 09/395,409). 

In other detection methods, it is necessary to first amplify prior to 
identifying the allelic variant. Amplification can be performed, e.g., by PCR 
and/or LCR, according to methods known in the art. In one embodiment, 
genomic DNA of a cell is exposed to two PCR primers and amplified for a 
number of cycles sufficient to produce the required amount of amplified DNA. In 
preferred embodiments, the primers are located between 150 and 350 base pairs 
apart. 

Alternative amplification methods include: self sustained sequence 
replication (Guatelli, J. C. et al., 1990, Proc. Natl. Acad. Sci. U.S.A. 
87:1874-1878), transcriptional amplification system (Kwoh, D. Y. et al., 1989, 
Proc. Natl. Acad. Sci. U.S.A. 86:1173-1177), Q-Beta Replicase (Lizardi, P. M. et 
al., 1988, Bio/Technology 6:1197), or any other nucleic acid amplification 
method, followed by the detection of the amplified molecules using techniques 
well known to those of skill in the art. These detection schemes are especially 
useful for the detection of nucleic acid molecules if such molecules are present 
in very low numbers. 

Nucleic acids can also be analyzed by detection methods and protocols, 
particularly those that rely on mass spectrometry (see, e.g., U.S. Patent Nos. 
5,605,798 and 6,043,031, allowed copending U.S. application Serial No. 
08/744,481, U.S. application Serial No. 08/990,851 and International PCT 
application No. WO 99/31273, International PCT application No. WO 98/20019). 
These methods can be automated (see, e.g., copending U.S. application Serial 
No. 09/285,481 and published International PCT application No. 
PCT/US00/08111, which describes an automated process line). Preferred 
among the methods of analysis herein are those involving the primer oligo base 
extension (PROBE) reaction with mass spectrometry for detection (described 
herein and elsewhere, see e.g., U.S. Patent No. 6,043,031; see, also U.S. 
application Serial Nos. 09/287,681, 09/287,682, 09/287,141 and 09/287,679, 
allowed copending U.S. application Serial No. 08/744,481, International PCT 
application No. PCT/US97/20444, published as International PCT application No. 
WO 98/20019, and based upon U.S. application Serial Nos. 08/744,481, 
08/744,590, 08/746,036, 08/746,055, 08/786,988, 08/787,639, 08/933,792, 
08/746,055, 08/786,988 and 08/787,639; see, also U.S. application Serial No. 
09/074,936, U.S. Patent No. 6,024,925, and U.S. application Serial Nos. 
08/746,055 and 08/786,988, and published International PCT application No. 
WO 98/20020). 

A preferred format for performing the analyses is a chip based format in 
which the biopolymer is linked to a solid support, such as a silicon or 
silicon-coated substrate, preferably in the form of an array. More preferably, when 
analyses are performed using mass spectrometry, particularly MALDI, small 
nanoliter volumes of sample are loaded on, such that the resulting spot is about, 
or smaller than, the size of the laser spot. It has been found that when this is 
achieved, the results from the mass spectrometry analysis are quantitative. The 
areas under the signals in the resulting mass spectra are proportional to 
concentration (when normalized and corrected for background). Methods for 
preparing and using such chips are described in U.S. Patent No. 6,024,925, 
co-pending U.S. application Serial Nos. 08/786,988, 09/364,774, 09/371,150 and 
09/297,575; see, also U.S. application Serial No. PCT/US97/20195, which 
published as WO 98/20020. Chips and kits for performing these analyses are 
commercially available from SEQUENOM under the trademark MassARRAY. 
MassARRAY relies on the fidelity of the enzymatic primer extension reactions 
combined with the miniaturized array and MALDI-TOF (Matrix-Assisted Laser 
Desorption Ionization-Time of Flight) mass spectrometry to deliver results rapidly. 
It accurately distinguishes single base changes in the size of DNA fragments 
associated with genetic variants without tags. 

The methods provided herein permit quantitative determination of alleles. 
The areas under the signals in the mass spectra can be used for quantitative 
determinations. The frequency is determined from the ratio of the signal to the 
total area of all of the spectrum and corrected for background. This is possible 
because of the PROBE technology as described in the above applications 
incorporated by reference herein. 
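
A minimal sketch of the ratio calculation described above follows. It assumes, for simplicity, that the relevant total is the sum of the background-corrected areas of the allele signals and that a single constant background area is subtracted from each signal; the function name and the numerical areas are hypothetical.

```python
def allele_frequencies(peak_areas, background=0.0):
    """Convert integrated peak areas (one per allele signal) into
    relative frequencies after subtracting a constant background area.

    Areas that fall below the background are treated as zero.
    """
    corrected = {allele: max(area - background, 0.0)
                 for allele, area in peak_areas.items()}
    total = sum(corrected.values())
    if total == 0.0:
        raise ValueError("no signal above background")
    return {allele: area / total for allele, area in corrected.items()}


# Hypothetical areas for two allele signals in one mass spectrum.
print(allele_frequencies({"R": 850.0, "Q": 250.0}, background=50.0))
```

Because both signals come from the same spectrum, the ratio needs no external standard, which is the sense of standardless analysis as defined herein.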

Additional methods of analyzing nucleic acids include amplification-based 
methods including polymerase chain reaction (PCR), ligase chain reaction (LCR), 
mini-PCR, rolling circle amplification, autocatalytic methods, such as those using 
Qβ replicase, TAS, 3SR, and any other suitable method known to those of skill 
in the art. 

Other methods for analysis and identification and detection of 
polymorphisms include, but are not limited to, allele specific probes, Southern 
analyses, and other such analyses. 

The methods described below provide ways to fragment given amplified 
or non-amplified nucleotide sequences, thereby producing a set of mass signals 
when mass spectrometry is used to analyze the fragment mixtures. 
Amplified fragments are yielded by standard polymerase chain methods (US 
4,683,195 and 4,683,202). The fragmentation method involves the use of 
enzymes that cleave single or double strands of DNA and enzymes that ligate 
DNA. The cleavage enzymes can be glycosylases, nickases, and site-specific 
and non site-specific nucleases with the most preferred enzymes being 
glycosylases, nickases, and site-specific nucleases. 

Glycosylase Fragmentation Method 

DNA glycosylases specifically remove a certain type of nucleobase from a 
given DNA fragment. These enzymes can thereby produce abasic sites, which 
can be recognized either by another cleavage enzyme, cleaving the exposed 
phosphate backbone specifically at the abasic site and producing a set of 
nucleobase specific fragments indicative of the sequence, or by chemical means, 
such as alkaline solutions and/or heat. The use of one combination of a DNA 
glycosylase and its targeted nucleotide would be sufficient to generate a base 
specific signature pattern of any given target region. 

Numerous DNA glycosylases are known. For example, a DNA glycosylase 
can be uracil-DNA glycosylase (UDG), 3-methyladenine DNA glycosylase, 
3-methyladenine DNA glycosylase II, pyrimidine hydrate-DNA glycosylase, 
FaPy-DNA glycosylase, thymine mismatch-DNA glycosylase, hypoxanthine-DNA 
glycosylase, 5-Hydroxymethyluracil DNA glycosylase (HmUDG), 
5-Hydroxymethylcytosine DNA glycosylase, or 1,N6-ethenoadenine DNA 
glycosylase (see, e.g., U.S. Patent Nos. 5,536,649, 5,888,795, 5,952,176 
and 6,099,553, International PCT application Nos. WO 97/03210, 
WO 99/54501; see, also, Eftedal et al. (1993) Nucleic Acids Res 21:2095-2101, 
Bjelland and Seeberg (1987) Nucleic Acids Res. 15:2787-2801, Saparbaev et al. 
(1995) Nucleic Acids Res. 23:3750-3755, Bessho (1999) Nucleic Acids Res. 
27:979-983) corresponding to the enzyme's modified nucleotide or nucleotide 
analog target. A preferred glycosylase is uracil-DNA glycosylase (UDG). 

Uracil, for example, can be incorporated into an amplified DNA molecule 
by amplifying the DNA in the presence of normal DNA precursor nucleotides 
(e.g. dCTP, dATP, and dGTP) and dUTP. When the amplified product is treated 
with UDG, uracil residues are cleaved. Subsequent chemical treatment of the 
products from the UDG reaction results in the cleavage of the phosphate 
backbone and the generation of nucleobase specific fragments. Moreover, the 
separation of the complementary strands of the amplified product prior to 
glycosylase treatment allows complementary patterns of fragmentation to be 
generated. Thus, the use of dUTP and Uracil DNA glycosylase allows the 
generation of T specific fragments for the complementary strands, thus providing 
information on the T as well as the A positions within a given sequence. Similar 
to this, a C-specific reaction on both (complementary) strands (i.e. with a 
C-specific glycosylase) yields information on C as well as G positions within a 
given sequence if the fragmentation patterns of both amplification strands are 
analyzed separately. Thus, with the glycosylase method and mass 
spectrometry, a full series of A, C, G and T specific fragmentation patterns can 
be analyzed. 
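
The base-specific fragmentation described above can be illustrated at the sequence level with a simplified in-silico sketch. It assumes that every T position has been replaced by uracil during amplification, that cleavage is complete and that the cleaved base itself is lost. Terminal chemical groups and fragment masses are not modelled, and the function names and example sequence are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def base_specific_fragments(seq, base="T"):
    """Split one strand at every occurrence of `base`, dropping the
    cleaved base and any empty fragments between adjacent sites."""
    return [frag for frag in seq.upper().split(base) if frag]


strand = "GATTACAGT"
# T-specific fragments of the given strand.
print(base_specific_fragments(strand))
# T-specific fragments of the complementary strand, which report on
# the A positions of the given strand.
print(base_specific_fragments(reverse_complement(strand)))
```
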

Nickase Fragmentation Method 

A DNA nickase, or DNase, can be used to recognize and cleave one strand 
of a DNA duplex. Numerous nickases are known. Among these, for example, 
are NY2A nickase and NYS1 nickase (Megabase) with the following 
cleavage sites: 

NY2A: 5'...R AG...3' 
      3'...Y TC...5' where R = A or G and Y = C or T 
NYS1: 5'... CC[A/G/T]...3' 
      3'... GG[T/C/A]...5'. 
Fen-Ligase Fragmentation Method 

The Fen-ligase method involves two enzymes: Fen-1 enzyme and a ligase. 
The Fen-1 enzyme is a site-specific nuclease known as a "flap" endonuclease 
(US 5,843,669, 5,874,283, and 6,090,606). This enzyme recognizes and 
cleaves DNA "flaps" created by the overlap of two oligonucleotides hybridized to 
a target DNA strand. This cleavage is highly specific and can recognize single 
base pair mutations, permitting detection of a single homologue from an 
individual heterozygous at one SNP of interest and then genotyping that 
homologue at other SNPs occurring within the fragment. Fen-1 enzymes can be 
Fen-1 like nucleases, e.g. human, murine, and Xenopus XPG enzymes and yeast 
RAD2 nucleases, or Fen-1 endonucleases from, for example, M. jannaschii, P. 
furiosus, and P. woesei. Among preferred enzymes are the Fen-1 enzymes. 

The ligase enzyme forms a phosphodiester bond between two double 
stranded nucleic acid fragments. The ligase can be DNA Ligase I or DNA Ligase 
III (see, e.g., U.S. Patent Nos. 5,506,137, 5,700,672, 5,858,705 and 
5,976,806; see, also, Waga et al. (1994) J. Biol. Chem. 269:10923-10934, Li 
et al. (1994) Nucleic Acids Res. 22:632-638, Arrand et al. (1986) J. Biol. Chem. 
261:9079-9082, Lehman (1974) Science 186:790-797, Higgins and Cozzarelli 
(1979) Methods Enzymol. 68:50-71, Lasko et al. (1990) Mutation Res. 
236:277-287, and Lindahl and Barnes (1992) Ann. Rev. Biochem. 61:251-281). 
Thermostable ligases (Epicenter Technologies), where "thermostable" 
denotes that the ligase retains activity even after exposure to temperatures 
necessary to separate two strands of DNA, are among preferred ligases for use 
herein. 

Type IIS Enzyme Fragmentation Method 

Restriction enzymes bind specifically to and cleave double-stranded DNA 
at specific sites within or adjacent to a particular recognition sequence. These 
enzymes have been classified into three groups (e.g. Types I, II, and III) as 
known to those of skill in the art. Because of the properties of type I and type III 
enzymes, they have not been widely used in molecular biological applications. 
Thus, for this invention type II enzymes are preferred. Of the thousands of 
restriction enzymes known in the art, there are 179 different type II 
specificities. Of the 179 unique type II restriction endonucleases, 31 have a 
4-base recognition sequence, 11 have a 5-base recognition sequence, 127 have a 
6-base recognition sequence, and 10 have recognition sequences of greater than 
six bases (US 5,604,098). Of category type II enzymes, type IIS is preferred. 

Type IIS enzymes can be Alw XI, Bbv I, Bce 83, Bpm I, Bsg I, Bsm AI, 
Bsm FI, Bsa I, Bcc I, Bcg I, Ear I, Eco 57I, Esp 3I, Fau I, Fok I, Gsu I, Hga I, Mme 
I, Mbo II, Sap I, and the like. The preferred type IIS enzyme is Fok I. 

The Fok I endonuclease is an exemplary well characterized 
member of the Type IIS class (see, e.g., U.S. Patent Nos. 5,714,330, 
5,604,098, 5,436,150, 6,054,276 and 5,871,911; see, also, Szybalski et al. 
(1991) Gene 100:13-26, Wilson and Murray (1991) Ann. Rev. Genet. 
25:585-627, Sugisaki et al. (1981) Gene 16:73-78, Podhajska and Szybalski (1985) 
Gene 40:175-182). Fok I recognizes the sequence 5'-GGATG-3' and cleaves DNA 
accordingly. Type IIS restriction sites can be introduced into DNA targets by 
incorporating the site into primers used to amplify such targets. Fragments 
produced by digestion with Fok I are site specific and can be analyzed by mass 
spectrometry methods such as MALDI-TOF mass spectrometry, ESI-TOF mass 
spectrometry, and any other type of mass spectrometry well known to those of 
skill in the art. 
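
For illustration, the sketch below checks a target sequence for the Fok I recognition sequence on either strand, for example to confirm that a site introduced through a primer is present. It locates the recognition sequence only and does not model the position of cleavage; the function name and the example sequence are hypothetical.

```python
FOK1_SITE = "GGATG"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def find_fok1_sites(seq):
    """Return sorted (index, strand) pairs for each recognition sequence.

    '+' means GGATG reads 5' to 3' on the given strand; '-' means it lies
    on the complementary strand and appears here as its reverse
    complement."""
    seq = seq.upper()
    rc_site = FOK1_SITE.translate(_COMPLEMENT)[::-1]
    hits = []
    for site, strand in ((FOK1_SITE, "+"), (rc_site, "-")):
        start = seq.find(site)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(site, start + 1)
    return sorted(hits)


print(find_fok1_sites("AAGGATGTTTCATCCAA"))
```
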

Once a polymorphism has been found to correlate with a parameter 
such as age, the possibility of false results due to allelic dropout is examined by 
doing comparative PCR in an adjacent region of the genome. 
Analyses 

In using the database, allelic frequencies can be determined across the 
population by analyzing each sample in the population individually, determining 
the presence or absence of the allele or marker of interest in each individual sample, 
and then determining the frequency of the marker in the population. The 
database can then be sorted (stratified) to identify any correlations between the 
allele and a selected parameter using standard statistical analysis. If a 
correlation is observed, such as a decrease in a particular marker with age or 
correlation with sex or other parameter, then the marker is a candidate for 
further study, such as genetic mapping to identify a gene or pathway in which it 
is involved. The marker may then be correlated, for example, with a disease. 
Haplotyping can also be carried out. Genetic mapping can be effected using 
standard methods and may also require use of databases of others, such as 
databases previously determined to be associated with a disorder. 
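
The sorting (stratification) step can be sketched as follows, assuming that each record holds the number of copies (0, 1 or 2) of the allele of interest carried by a diploid subject. The record layout, the age cut-off and the function name are hypothetical, and the statistical test that would follow is not shown.

```python
from collections import defaultdict


def frequency_by_stratum(records, stratum_of):
    """Return the allele frequency within each stratum.

    `records` is an iterable of dicts with an 'allele_count' entry;
    `stratum_of` maps a record to the label of its stratum."""
    copies = defaultdict(int)
    chromosomes = defaultdict(int)
    for record in records:
        label = stratum_of(record)
        copies[label] += record["allele_count"]
        chromosomes[label] += 2          # two chromosomes per subject
    return {label: copies[label] / chromosomes[label] for label in chromosomes}


records = [
    {"age": 25, "allele_count": 2},
    {"age": 30, "allele_count": 1},
    {"age": 65, "allele_count": 0},
    {"age": 70, "allele_count": 1},
]
by_age = frequency_by_stratum(
    records, lambda r: "under_40" if r["age"] < 40 else "40_and_over")
print(by_age)
```

A difference between strata, such as the lower frequency in the older group of this made-up data, would mark the allele as a candidate for further study only after an appropriate statistical test on a sufficiently large sample.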

Exemplary analyses have been performed and these are shown in the 
figures, and discussed herein. 

Sample pooling 

It has been found that using the databases provided herein, or any other 
database of such information, substantially the same frequencies that were 
obtained by examining each sample separately can be obtained by pooling 
samples, such as in batches of 10, 20, 50, 100, 200, 500, 1000 or any other 
number. A precise number may be determined empirically if necessary, and can 
be as low as 3. 

In one embodiment, the frequency of genotypic and other markers can be 
obtained by pooling samples. To do this a target population and a genetic 
variation to be assessed are selected, a plurality of samples of biopolymers are 
obtained from members of the population, and the biopolymer from which the 
marker or genotype can be inferred is determined or detected. A comparison of 
samples tested in pools and individually and the sorted results therefrom are 
shown in Figure 9, which shows frequency of the factor VII Allele 353Q. Figure 
10 depicts the frequency of the CETP Allele CETP in pooled versus individual 
samples. Figure 15 shows ethnic diversity among various ethnic groups in the 
database using pooled DNA samples to obtain the data. Figures 12-14 show 
mass spectra for these samples. 

Pooling of test samples has application not only to the healthy databases 
provided herein, but also to use in gathering data for entry into any database of 
subjects and genotypic information, including typical databases derived from 
diseased populations. What is demonstrated herein is the finding that the 
results achieved are statistically the same as the results that would be achieved 
if each sample is analyzed separately. Analysis of pooled samples by a method, 
such as the mass spectrometric methods provided herein, permits resolution of 
such data and quantitation of the results. 

For factor VII, the R353Q amino acid polymorphism was assessed. In Figure 9,
the "individual" data represent the allelic frequency observed in 92 individual
reactions. The pooled data represent the allelic frequency of the same 92
individuals pooled into a single probe reaction. The concentration of DNA in the
samples of individual donors is 250 nanograms. The total concentration of DNA
in the pooled samples is also 250 nanograms, where the concentration of any
individual DNA is 2.7 nanograms.
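
The 2.7 nanogram figure follows from dividing the 250 nanogram pool evenly
among the 92 donors. A minimal sketch of that arithmetic, assuming each donor
contributes an equal share (the function name is illustrative and not part of
the described methods):

```python
def dna_per_donor(total_ng: float, n_donors: int) -> float:
    """Mass of DNA each donor contributes to a pool with equal shares."""
    if n_donors <= 0:
        raise ValueError("a pool needs at least one donor")
    return total_ng / n_donors


# 250 ng pooled from 92 donors, as in the factor VII comparison.
per_donor = dna_per_donor(250.0, 92)
print(f"{per_donor:.1f} ng per donor")
```

The same function applied to a ten-fold smaller pool (25 ng) gives roughly
0.27 ng per donor.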

It also was shown that it is possible to reduce the DNA concentration of
individuals in a pooled sample from 2.7 nanograms to 0.27 nanograms without
any change in the quality of the spectrum or the ability to quantitate the amount
of sample detected. Hence, low concentrations of sample may be used in the
pooling methods.

Use of the databases and markers identified thereby

The successful use of genomics requires a scientific hypothesis (i.e.,
common genetic variation, such as a SNP), a study design (i.e., complex
disorders), samples and technology, such as the chip-based mass spectrometry
analyses (see, e.g., U.S. Patent No. 5,605,798, U.S. Patent No. 5,777,324,
U.S. Patent No. 6,043,031, allowed copending U.S. application Serial No.
08/744,481, U.S. application Serial No. 08/990,851, International PCT
application No. WO 98/20019, copending U.S. application Serial No.
09/285,481, which describes an automated process line for analyses; see, also,
U.S. application Serial Nos. 08/617,256, 09/287,681, 09/287,682, 09/287,141
and 09/287,679, allowed copending U.S. application Serial No. 08/744,481,
International PCT application No. PCT/US97/20444, published as International
PCT application No. WO 98/20019, and based upon U.S. application Serial Nos.
08/744,481, 08/744,590, 08/746,036, 08/746,055, 08/786,988, 08/787,639,
08/933,792, 08/746,055, 09/266,409, 08/786,988 and 08/787,639; see, also
U.S. application Serial No. 09/074,936). All of these aspects can be used in
conjunction with the databases provided herein and samples in the collection.

The databases and markers identified thereby can be used, for example,
for identification of previously unidentified or unknown genetic markers and to
identify new uses for known markers. As markers are identified, these may be
entered into the database to use as sorting parameters from which additional
correlations may be determined.

Previously unidentified or unknown genetic markers 

The samples in the healthy databases can be used to identify new
polymorphisms and genetic markers, using any mapping, sequencing,
amplification and other methodologies, and in looking for polymorphisms among
the population in the database. The thus-identified polymorphism can then be
entered into the database for each sample, and the database sorted (stratified)
using that polymorphism as a sorting parameter to identify any patterns and
correlations that emerge, such as age-correlated changes in the frequency of the
identified marker. If a correlation is identified, the locus of the marker can be
mapped and its function or effect assessed or deduced.

Thus, the databases here provide means for:

identification of significantly different allelic frequencies of genetic factors
by comparing the occurrence or disappearance of the markers with increasing
age in population and then associating the markers with a disease or a
biochemical pathway;

identification of significantly different allelic frequencies of disease
causing genetic factors by comparing the male with the female population or
comparing other selected stratified populations and associating the markers with
a disease or a biochemical pathway;

identification of significantly different allelic frequencies of disease
causing genetic factors by comparing different ethnic groups and associating the
markers with a disease or a biochemical pathway that is known to occur in high
frequency in the ethnic group;

profiling potentially functional variants of genes through the general
panmixed population stratified according to age, sex, and ethnic origin and
thereby demonstrating the contribution of the variant genes to the physical
condition of the investigated populations;

identification of functionally relevant gene variants by gene disequilibrium
analysis performed within the general panmixed population stratified according
to age, sex, and ethnic origin and thereby demonstrating their contribution to the
physical condition of investigated population;

identification of potentially functional variants of chromosomes or parts of
chromosomes by linkage disequilibrium analysis performed within the general
panmixed population stratified according to age, sex, and ethnic origin and
thereby demonstrating their contribution to the physical condition of investigated
population.

Uses of the identified markers and known markers 

The databases may also be used in conjunction with known markers and
sorted to identify any correlations. For example, the databases can be used for:

determination and evaluation of the penetrance of medically relevant
polymorphic markers;

determination and evaluation of the diagnostic specificity of medically
relevant genetic factors;

determination and evaluation of the positive predictive value of medically
relevant genetic factors;

determination and evaluation of the onset of complex diseases, such as,
but not limited to, diabetes, hypertension, autoimmune diseases,
arteriosclerosis, cancer and other diseases within the general population with
respect to their causative genetic factors;

delineation of the appropriate strategies for preventive disease treatment;

delineation of appropriate timelines for primary disease intervention;

validation of medically relevant genetic factors identified in isolated
populations regarding their general applicability;

validation of disease pathways including all potential target structures
identified in isolated populations regarding their general applicability; and

validation of appropriate drug targets identified in isolated populations
regarding their general applicability.

The diseases and disorders to which polymorphisms may be linked include
those linked to inborn errors of metabolism, acquired metabolic
disorders, intermediary metabolism, oncogenesis pathways, blood clotting
pathways, and DNA synthetic and repair pathways, including DNA
repair/replication/transcription factors and activities, e.g., genes related
to oncogenesis and aging, and genes involved in blood clotting and the related
biochemical pathways that are related to thrombosis, embolism, stroke,
myocardial infarction, angiogenesis and oncogenesis.

For example, a number of diseases are caused by or involve deficient or
defective enzymes in intermediary metabolism (see, e.g., Tables 1 and 2, below)
that result, upon ingestion of the enzyme substrates, in accumulation of harmful
metabolites that damage organs and tissues, particularly an infant's developing
brain and other organs, resulting in mental retardation and other developmental
disorders. Identification of markers and genes for such disorders is of great
interest.

Model systems

Several gene systems, namely p21, p53 and the Lipoprotein Lipase polymorphism
(N291S), were selected. The p53 gene is a tumor suppressor gene that is
mutated in diverse tumor types. One common allelic variant occurs at codon 72.
A polymorphism that has been identified in the p53 gene, i.e., the R72P allele,
results in an amino acid exchange, arginine to proline, at codon 72 of the gene.

Using diseased populations, it has been shown that there are ethnic
differences in the allelic distribution of these alleles among African-Americans
and Caucasians in the U.S. The results here support this finding and also
demonstrate that the results obtained with a healthy database are meaningful
(see Figure 7B).

The 291S allele leads to reduced levels of high density lipoprotein
cholesterol (HDL-C), which is associated with an increased risk in males of
arteriosclerosis and in particular myocardial infarction (see, Reymer et al. (1995)
Nature Genetics 10:28-34).

Both genetic polymorphisms were profiled within a part of the Caucasian
population-based sample bank. For the polymorphism located in the lipoprotein
lipase gene a total of 1025 unselected individuals (436 males and 589 females)
were tested. Genomic DNA was isolated from blood samples obtained from the
individuals.

As shown in the Examples and figures, an exemplary database containing
about 5000 subjects, answers to the questionnaire (see Figure 3), and genotypic
information has been stratified. A particular known allele has been selected, and
the samples tested for the marker using mass spectrometric analyses,
particularly PROBE (see the EXAMPLES), to identify polymorphisms in each
sample. The population in the database has been sorted according to various
parameters and correlations have been observed. For example, FIGURES 2A-C
show sorting of the data by age and sex for the Lipoprotein Lipase gene in the
Caucasian population in the database. The results show a decrease in the
frequency of the allele with age in males but no such decrease in females. Other
alleles that have been tested against the database include alleles of p53, p21
and factor VII. Results when sorted by age are shown in the figures.

These examples demonstrate an effect of altered frequency of
disease-causing genetic factors within the general population. The scientific
interpretation of those results allows prediction of medical relevance of
polymorphic genetic alterations. In addition, conclusions can be drawn with
regard to their penetrance, diagnostic specificity, positive predictive value, onset
of disease, most appropriate onset of preventive strategies, and the general
applicability of genetic alterations identified in isolated populations to panmixed
populations.

Therefore, an age- and sex-stratified population-based sample bank that is 
ethnically homogenous is a suitable tool for rapid identification and validation of 
genetic factors regarding their potential medical utility. 

Exemplary computer system for creating, storing and processing the databases
Systems

Systems, including computers, containing the databases are provided
herein. The computers and databases can be used in conjunction, for example,
with the APL system (see, copending U.S. application Serial No. 09/285,481),
which is an automated system for analyzing biopolymers, particularly nucleic
acids. Results from the APL system can be entered into the database.

Any suitable computer system may be used. The computer system may
be integrated into systems for sample analysis, such as the automated process
line described herein (see, e.g., copending U.S. application Serial No.
09/285,481).

Figure 17 is a block diagram of a computer constructed to provide and
process the databases described herein. The processing that maintains the
database and performs the methods and procedures may be performed on
multiple computers all having a similar construction, or may be performed by a
single, integrated computer. For example, the computer through which data is
added to the database may be separate from the computer through which the
database is sorted, or may be integrated with it. In either arrangement, the
computers performing the processing may have a construction as illustrated in
Figure 17.

Figure 17 is a block diagram of an exemplary computer 1700 that
maintains the database described above and performs the methods and
procedures. Each computer 1700 operates under control of a central processor
unit (CPU) 1702, such as a "Pentium" microprocessor and associated integrated
circuit chips, available from Intel Corporation of Santa Clara, California, USA. A
computer user can input commands and data from a keyboard and display
mouse 1704 and can view inputs and computer output at a display 1706. The
display is typically a video monitor or flat panel display device. The computer
1700 also includes a direct access storage device (DASD) 1707, such as a fixed
hard disk drive. The memory 1708 typically comprises volatile semiconductor
random access memory (RAM). Each computer preferably includes a program
product reader 1710 that accepts a program product storage device 1712, from
which the program product reader can read data (and to which it can optionally
write data). The program product reader can comprise, for example, a disk
drive, and the program product storage device can comprise removable storage
media such as a magnetic floppy disk, an optical CD-ROM disc, a CD-R disc, a
CD-RW disc, or a DVD data disc. If desired, the computers can be connected so
they can communicate with each other, and with other connected computers,
over a network 1713. Each computer 1700 can communicate with the other
connected computers over the network 1713 through a network interface 1714
that enables communication over a connection 1716 between the network and
the computer.

The computer 1700 operates under control of programming steps that are
temporarily stored in the memory 1708 in accordance with conventional
computer construction. When the programming steps are executed by the CPU
1702, the pertinent system components perform their respective functions.
Thus, the programming steps implement the functionality of the system as
described above. The programming steps can be received from the DASD 1707,
through the program product reader 1710, or through the network connection
1716. The storage drive 1710 can receive a program product, read
programming steps recorded thereon and transfer the programming steps into
the memory 1708 for execution by the CPU 1702. As noted above, the
program product storage device 1712 can comprise any one of multiple
removable media having recorded computer-readable instructions, including
magnetic floppy disks and CD-ROM storage discs. Other suitable program
product storage devices can include magnetic tape and semiconductor memory
chips. In this way, the processing steps necessary for operation can be
embodied on a program product.

Alternatively, the program steps can be received into the operating
memory 1708 over the network 1713. In the network method, the computer
receives data including program steps into the memory 1708 through the
network interface 1714 after network communication has been established over
the network connection 1716 by well-known methods that will be understood by
those skilled in the art without further explanation. The program steps are then
executed by the CPU 1702 to implement the processing of the database
system.

It should be understood that all of the computers of the system preferably
have a construction similar to that shown in Figure 17, so that details described
with respect to the Figure 17 computer 1700 will be understood to apply to all
computers of the system 1700. This is indicated by multiple computers 1700
shown connected to the network 1713. Any one of the computers 1700 can
have an alternative construction, so long as they can communicate with the
other computers and support the functionality described herein.

Figure 18 is a flow diagram that illustrates the processing steps
performed using the computer illustrated in Figure 17, to maintain and provide
access to the databases, such as for identifying polymorphic genetic markers. In
particular, the information contained in the database is stored in computers
having a construction similar to that illustrated in Figure 17. The first step for
maintaining the database, as indicated in Figure 18, is to identify healthy
members of a population. As noted above, the population members are subjects
that are selected only on the basis of being healthy, and where the subjects are
mammals, such as humans, they are preferably selected based upon apparent
health and the absence of detectable infections. The step of identifying is
represented by the flow diagram box numbered 1802.

The next step, represented by the flow diagram box numbered 1804, is
to obtain identifying and historical information and data relating to the identified
members of the population. The information and data comprise parameters for
each of the population members, such as member age, ethnicity, sex, medical
history, and ultimately genotypic information. Initially, the parameter information
is obtained from a questionnaire answered by each member, from whom a body
tissue or body fluid sample also is obtained. The step of entering and storing
these parameters into the database of the computer is represented by the flow
diagram box numbered 1806. As additional information about each population
member and corresponding sample is obtained, this information can be inputted
into the database and can serve as a sorting parameter.

In the next step, represented by the flow diagram box numbered 1808,
the parameters of the members are associated with an indexer. This step may
be executed as part of the database storage operation, such as when a new data
record is stored according to the relational database structure and is
automatically linked with other records according to that structure. The step
1808 also may be executed as part of a conventional data sorting or retrieval
process, in which the database entries are searched according to an input search
or indexing key value to determine attributes of the data. For example, such
search and sort techniques may be used to follow the occurrence of known
genetic markers and then determine if there is a correlation with diseases for
which they have been implicated. Examples of this use are for assessing the
frequencies of the p53 and Lipoprotein Lipase polymorphisms.

Such searching of the database also may be valuable for identifying one
or more genetic markers whose frequency changes within the population as a
function of age, ethnic group, sex, or some other criterion. This can allow the
identification of previously unknown polymorphisms and, ultimately,
identification of a gene or pathway involved in the onset and progression of
disease.



WO 01/27857 PCT/US00/28413 




In addition, the database can be used for taking an identified 
polymorphism and ascertaining whether it changes in frequency when the data is 
sorted according to a selected parameter. 

In this way, the databases and methods provided herein permit, among
other things, identification of components, particularly key components, of a
disease process by understanding its genetic underpinnings, and also an
understanding of processes, such as individual drug responses. The databases
and methods provided herein also can be used in methods involving elucidation
of pathological pathways, in developing new diagnostic assays, identifying new
potential drug targets, and in identifying new drug candidates.

Morbidity and/or early mortality associated polymorphisms

A database containing information provided by a population of healthy
blood donors who were not selected for any particular disease can be used to
identify polymorphisms, and the alleles in which they are present, whose
frequency decreases with age. These may represent morbidity susceptibility
markers and genes.

Polymorphisms of the genome can lead to altered gene function, protein
function or genome instability. To identify those polymorphisms which have a
clinical relevance/utility is the goal of a world-wide scientific effort. It can be
expected that the discovery of such polymorphisms will have a fundamental
impact on the identification and development of novel drug compounds to cure
diseases. However, the strategy to identify valuable polymorphisms is
cumbersome and dependent upon the availability of many large patient and
control cohorts to show disease association. In particular, genes that cause a
general risk of the population to suffer from any disease (morbidity susceptibility
genes) will escape these case/control studies entirely.

Described here is a screening strategy to identify morbidity susceptibility
genes underlying a variety of different diseases. A morbidity
susceptibility gene is defined as a gene that is expressed in many different cell
types or tissues (housekeeping gene) and whose altered function can facilitate the
expression of a clinical phenotype caused by disease-specific susceptibility genes
that are involved in a pathway specific for this disorder. In other words, these
morbidity susceptibility genes predispose people to develop a distinct disease
according to their genetic make-up for this disease.

Candidates for morbidity susceptibility genes can be found at the bottom
level of pathways involving transcription, translation, heat-shock proteins,
protein trafficking, DNA repair, assembly systems for subcellular structures (e.g.,
mitochondria, peroxisomes and other cellular microbodies), receptor signaling
cascades, immunology, etc. Those pathways control the quality of life at the
cellular level as well as for the entire organism. Mutations/polymorphisms
located in genes encoding proteins for those pathways can reduce the fitness of
cells and make the organism more susceptible to express the clinical phenotype
caused by the action of a disease-specific susceptibility gene. Therefore, these
morbidity susceptibility genes can be potentially involved in a whole variety of
different complex diseases, if not in all. Disease-specific susceptibility genes are
involved in pathways that can be considered as disease-specific pathways, such
as glucose, lipid and hormone metabolism, etc.

The exemplified methods permit, among other things, identification of
genes and/or gene products involved in a man's general susceptibility to
morbidity and/or mortality; use of these genes and/or gene products in studies to
elucidate the genetic underpinnings of human diseases; use of these genes
and/or gene products in combinatorial statistical analyses without or together
with disease-specific susceptibility genes; use of these genes and/or gene
products to predict penetrance of disease susceptibility genes; use of these
genes and/or gene products in predisposition and/or acute medical diagnostics;
and use of these genes and/or gene products to develop drugs to cure diseases
and/or to extend the life span of humans.

SCREENING PROCESS

The healthy population stratified by age, gender and ethnicity, etc. is a
very efficient and universal screening tool for morbidity associated genes.
Changes of allelic frequencies in the young compared to the old population are
expected to indicate putative morbidity susceptibility genes. Individual samples
of this healthy population base can be pooled to further increase the throughput.
In a proof of principle experiment, pools of young and old Caucasian females and
males were applied to screen more than 400 randomly chosen single nucleotide
polymorphisms located in many different genes. Candidate polymorphisms were
identified if the allelic difference was greater than 8% between young and old for
both or only one of the genders. The initial results were assayed again in at
least one independent subsequent experiment. Repeated experiments are
necessary to recognize unstable biochemical reactions, which occur with a
frequency of about 2-3% and can mimic age-related allelic frequency
differences. Average frequency differences and standard deviations are
calculated after the initial results have been successfully reproduced. The final
allelic frequency is then compared to that of a reference Caucasian CEPH sample
pool. The result should show similar allelic frequencies in the young Caucasian
population. Subsequently, the exact allele frequencies of candidates, including
genotype information, were obtained by analyzing all individual samples. This
procedure is straightforward with regard to time and cost. It enables the
screening of an enormous number of SNPs. So far, several markers with a
highly significant association to age were identified and are described below.
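
The candidate filter described above can be sketched as follows. This is a
minimal illustration only: it assumes that "greater than 8%" means an absolute
difference in allele frequency, and the marker names and frequencies are made up
for the example rather than taken from the screen.

```python
THRESHOLD = 0.08  # assumption: "greater than 8%" read as an absolute frequency difference


def is_candidate(freqs: dict) -> bool:
    """freqs maps sex -> (young_frequency, old_frequency) measured in pools.

    A marker qualifies when the young/old difference exceeds the threshold
    for both sexes or for only one of them.
    """
    return any(abs(young - old) > THRESHOLD for young, old in freqs.values())


# Hypothetical pooled allele frequencies, for illustration only.
pooled = {
    "snp_a": {"female": (0.30, 0.29), "male": (0.31, 0.20)},
    "snp_b": {"female": (0.12, 0.10), "male": (0.11, 0.13)},
}
candidates = [name for name, freqs in pooled.items() if is_candidate(freqs)]
print(candidates)
```

Markers flagged this way would still need to be re-assayed in an independent
experiment, as the text requires, before individual genotyping.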

In general, at least 5 individuals in a stratified population need to be
screened to produce statistically significant results. The frequency of the allele
is determined for an age-stratified population. Chi square analysis is then
performed on the allelic frequencies to determine if the difference between age
groups is statistically significant. A p value of less than 0.1 is considered to
represent a statistically significant difference. More preferably, the p value
should be less than 0.05.
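
A chi square comparison of allele counts between two age groups can be sketched
as below. This is an illustrative Pearson test on a 2x2 table without continuity
correction, which is one common choice and not necessarily the exact procedure
used here; the allele counts are hypothetical.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Rows are age groups, columns are counts of the two alleles.
    Returns (statistic, p_value) for one degree of freedom.
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For one degree of freedom the chi-square tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical allele counts (two alleles per subject): young group, then old group.
stat, p = chi_square_2x2(60, 940, 30, 970)
print(f"chi2 = {stat:.2f}, p = {p:.4f}")
print("significant at 0.05" if p < 0.05 else "not significant at 0.05")
```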
Clinical Trials 

The identification of markers whose frequency in a population decreases
with age also allows for better designed and balanced clinical trials. Currently, if
a clinical trial utilizes a marker as a significant endpoint in a study and the
marker disappears with age, then the results of the study may be inaccurate. By
using methods provided herein, it can be ascertained whether a marker decreases
in frequency with age. This information can then be considered and controlled
for when designing the study. For example, an age-independent marker could be
substituted in its place.

The following examples are included for illustrative purposes only and are
not intended to limit the scope of the invention.

EXAMPLE 1 

This example describes the use of a database containing information
provided by a population of healthy blood donors who were not selected for any
particular disease to determine the distribution of allelic frequencies of known
genetic markers with age and by sex in a Caucasian subpopulation of the
database. The results described in this example demonstrate that a
disease-related genetic marker or polymorphism can be identified by sorting a
healthy database by a parameter or parameters, such as age, sex and ethnicity.

Generating a database

Blood was obtained by venous puncture from human subjects who met
blood bank criteria for donating blood. The blood samples were preserved with
EDTA at pH 8.0 and labeled. Each donor provided information such as age, sex,
ethnicity, medical history and family medical history. Each sample was labeled
with a barcode representing identifying information. A database was generated
by entering, for each donor, the subject identifier and information corresponding
to that subject into the memory of a computer storage medium using
commercially available software, e.g., Microsoft Access.
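
As an illustration of how such donor records can be stored and then sorted by
parameters, the following minimal sketch uses Python's built-in sqlite3 module as
a stand-in for the commercial database software named above. The table layout,
column names, genotype encoding and records are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE donors (
           barcode   TEXT PRIMARY KEY,  -- sample identifier from the barcode label
           age       INTEGER,
           sex       TEXT,
           ethnicity TEXT,
           genotype  TEXT               -- entered later, once the sample is typed
       )"""
)
rows = [  # made-up records, for illustration only
    ("S001", 25, "M", "Caucasian", "AA"),
    ("S002", 67, "M", "Caucasian", "AB"),
    ("S003", 34, "F", "Caucasian", "AB"),
    ("S004", 71, "F", "Caucasian", "AA"),
]
conn.executemany("INSERT INTO donors VALUES (?, ?, ?, ?, ?)", rows)


def allele_frequency(conn, allele, min_age, max_age, sex):
    """Frequency of `allele` among donors in one age/sex stratum, or None if empty."""
    genotypes = [
        g for (g,) in conn.execute(
            "SELECT genotype FROM donors WHERE age BETWEEN ? AND ? AND sex = ?",
            (min_age, max_age, sex),
        )
    ]
    if not genotypes:
        return None
    # Each subject carries two alleles, so divide by twice the subject count.
    return sum(g.count(allele) for g in genotypes) / (2 * len(genotypes))


print(allele_frequency(conn, "B", 60, 79, "M"))
```

Comparing the value returned for different age ranges or sexes corresponds to
sorting the database by those parameters.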

Model genetic markers 

The frequencies of polymorphisms known to be associated at some level
with disease were determined in a subpopulation of the subjects represented in
the database. These known polymorphisms occur in the p21, p53 and
Lipoprotein Lipase genes. Specifically, the N291S polymorphism of the
Lipoprotein Lipase gene, which results in a substitution of a serine for an
asparagine at amino acid codon 291, leads to reduced levels of high density
lipoprotein cholesterol (HDL-C), which is associated with an increased risk in
males of arteriosclerosis and in particular myocardial infarction (see, Reymer et al.
(1995) Nature Genetics 10:28-34).

The p53 gene encodes a cell cycle control protein that assesses DNA
damage and acts as a transcription factor regulating genes that control cell
growth, DNA repair and apoptosis (programmed cell death). Mutations in the
p53 gene have been found in a wide variety of different cancers, including
different types of leukemia, with varying frequency. The loss of normal p53
function results in genomic instability and uncontrolled cell growth. A
polymorphism that has been identified in the p53 gene, i.e., the R72P allele,
results in the substitution of a proline for an arginine at amino acid codon 72 of
the gene.

The p21 gene encodes a cyclin-dependent kinase inhibitor associated
with G1 phase arrest of normal cells. Expression of the p21 gene triggers
apoptosis. Polymorphisms of the p21 gene have been associated with Wilms'
tumor, a pediatric kidney cancer. One polymorphism of the p21 gene, the S31R
polymorphism, results in a substitution of an arginine for a serine at amino acid
codon 31.

Database analysis 

Sorting of subjects according to specific parameters 

The genetic polymorphisms were profiled within segments of the
Caucasian subpopulation of the sample bank. For p53 profiling, the genomic
DNA isolated from blood from a total of 1277 Caucasian subjects age 18-59
years and 457 Caucasian subjects age 60-79 years was analyzed. For p21
profiling, the genomic DNA isolated from blood from a total of 910 Caucasian
subjects age 18-49 years and 824 Caucasian subjects age 50-79 years was
analyzed. For lipoprotein lipase gene profiling, the genomic DNA from a total of
1464 Caucasian females and 1470 Caucasian males under 60 years of age and
a total of 478 Caucasian females and 560 Caucasian males over 60 years of age
was analyzed.

Isolation and analysis of genomic DNA

Genomic DNA was isolated from blood samples obtained from the
individuals. Ten milliliters of whole blood from each individual was centrifuged
at 2000 x g. One milliliter of the buffy coat was added to 9 ml of 155 mM
NH4Cl, 10 mM KHCO3, and 0.1 mM Na2EDTA, incubated 10 min at room
temperature and centrifuged for 10 min at 2000 x g. The supernatant was
removed, and the white cell pellet was washed in 155 mM NH4Cl, 10 mM
KHCO3 and 0.1 mM Na2EDTA and resuspended in 4.5 ml of 50 mM Tris, 5 mM
EDTA and 1% SDS. Proteins were precipitated from the cell lysate by 6 mM
ammonium acetate, pH 7.3, and then separated from the nucleic acids by
centrifugation at 3000 x g. The nucleic acid was recovered from the
supernatant by the addition of an equal volume of 100% isopropanol and
centrifugation at 2000 x g. The dried nucleic acid pellet was hydrated in 10 mM
Tris, pH 7.6, and 1 mM Na2EDTA and stored at 4°C.

Assays of the genomic DNA to determine the presence or absence of the
known genetic markers were developed using the BiomassPROBE™ detection
method (primer oligo base extension reaction). This method uses a
single detection primer followed by an oligonucleotide extension step
to give products that can be readily resolved by mass spectrometry, in
particular MALDI-TOF mass spectrometry. The products differ in length
depending on the presence or absence of a polymorphism. In this method,
a detection primer anneals adjacent to the site of a variable
nucleotide or sequence of nucleotides, and the primer is extended using
a DNA polymerase in the presence of one or more dideoxyNTPs and,
optionally, one or more deoxyNTPs. The resulting products are resolved
by MALDI-TOF mass spectrometry. The mass of the products as measured by
MALDI-TOF mass spectrometry makes possible the determination of the
nucleotide(s) present at the variable site.
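
The extension step described above can be illustrated with a minimal
Python sketch. It is not the BiomassPROBE implementation; the function
name `probe_extend` and its arguments are hypothetical, and the
downstream bases are taken from the p53 codon 72 region given later in
this example. The sketch assumes the polymerase adds bases one at a
time and stops as soon as a base supplied as a dideoxyNTP is
incorporated.

```python
def probe_extend(primer, downstream, terminators):
    """Return the extension product of a PROBE-style reaction.

    primer      -- detection primer sequence (5'->3')
    downstream  -- bases that would be added after the primer's 3' end,
                   written in the same sense as the primer
    terminators -- set of bases supplied as dideoxyNTPs
    """
    product = primer
    for base in downstream:
        product += base          # polymerase incorporates the next base
        if base in terminators:  # a ddNTP was incorporated: chain ends
            break
    return product


PRIMER = "gccagaggctgctcccc"  # P53/72 detection primer

# Termination with ddC; only the codon 72 base differs between alleles.
wild_type = probe_extend(PRIMER, "gcgtggcc", {"c"})  # codon 72 = cgc (Arg)
variant = probe_extend(PRIMER, "ccgtggcc", {"c"})    # codon 72 = ccc (Pro)

print(wild_type, len(wild_type))
print(variant, len(variant))
```

Under these assumptions the two alleles give products of different
length (19 and 18 nucleotides), which is what allows them to be
separated by mass.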

First, each of the Caucasian genomic DNA samples was subjected to
nucleic acid amplification using primers corresponding to sites 5' and
3' of the polymorphic sites of the p21 (S31R allele), p53 (R72P allele)
and lipoprotein lipase (N291S allele) genes. One primer in each primer
pair was biotinylated to permit immobilization of the amplification
product to a solid support. Specifically, the polymerase chain reaction
primers used for amplification of the relevant segments of the p21, p53
and lipoprotein lipase genes are shown below: US4p21c31-2F (SEQ ID NO:
9) and US5p21-2R (SEQ ID NO: 10) for p21 gene amplification;
US4-p53-ex4-F (also shown as p53-ex4US4 (SEQ ID NO: 2)) and
US5-p53/2-4R (also shown as US5P53/4R (SEQ ID NO: 3)) for p53 gene
amplification; and US4-LPL-F2 (SEQ ID NO: 16) and US5-LPL-R2 (SEQ ID
NO: 17) for lipoprotein lipase gene amplification.

Amplification of the respective DNA sequences was conducted according
to standard protocols. For example, primers may be used in a
concentration of 8 pmol. The reaction mixture (e.g., total volume
50 µl) may contain Taq polymerase, including 10x buffer and dNTPs.
Cycling conditions for polymerase chain reaction amplification may
typically be initially 5 min at 95°C, followed by 1 min at 94°C, 45 sec
at 53°C, and 30 sec at 72°C for 40 cycles, with a final extension time
of 5 min at 72°C. Amplification products may be purified by using
Qiagen's PCR purification kit (No. 28106) according to the
manufacturer's instructions. The elution of the purified products from
the column can be done in 50 µl TE buffer (10 mM Tris, 1 mM EDTA,
pH 7.5).

The purified amplification products were immobilized via a biotin-avidin
linkage to streptavidin-coated beads and the double-stranded DNA was
denatured. A detection primer was then annealed to the immobilized DNA
using conditions such as, for example, the following: 50 µl annealing
buffer (20 mM Tris, 10 mM KCl, 10 mM (NH4)2SO4, 2 mM MgSO4, 1% Triton
X-100, pH 8) at 50°C for 10 min, followed by washing of the beads three
times with 200 µl washing buffer (40 mM Tris, 1 mM EDTA, 50 mM NaCl,
0.1% Tween 20, pH 8.8) and once in 200 µl TE buffer.

The PROBE extension reaction was performed, for example, by using some
components of the DNA sequencing kit from USB (No. 70770) and dNTPs or
ddNTPs from Pharmacia. An exemplary protocol could include a total
reaction volume of 45 µl, containing 21 µl water, 6 µl Sequenase
buffer, 3 µl of 10 mM DTT solution, 4.5 µl of 0.5 mM of each of three
dNTPs, 4.5 µl of 2 mM of the missing one as ddNTP, 5.5 µl glycerol
enzyme dilution buffer, 0.25 µl Sequenase 2.0, and 0.25 µl
pyrophosphatase. The reaction can then be pipetted on ice and incubated
for 15 min at room temperature and for 5 min at 37°C. The beads may be
washed three times with 200 µl washing buffer and once with 60 µl of a
70 mM NH4-citrate solution.
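
As a quick consistency check on the exemplary protocol, the component
volumes can be summed and compared with the stated total reaction
volume. The sketch below is illustrative only; the dictionary keys are
informal labels, and the pyrophosphatase volume is assumed to be in
microliters (the unit is not legible in the text).

```python
# Component volumes of the exemplary PROBE extension mix, in microliters.
components_ul = {
    "water": 21.0,
    "Sequenase buffer": 6.0,
    "10 mM DTT": 3.0,
    "0.5 mM dNTPs (three)": 4.5,
    "2 mM ddNTP (the missing one)": 4.5,
    "glycerol enzyme dilution buffer": 5.5,
    "Sequenase 2.0": 0.25,
    "pyrophosphatase": 0.25,  # unit assumed to be microliters
}

total_ul = sum(components_ul.values())
print(f"total reaction volume: {total_ul} ul")
```

With the assumed unit, the components add up to the 45 µl total given
in the protocol.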

The DNA was denatured to release the extended primers from the
immobilized template. Each of the resulting extension products was
separately analyzed by MALDI-TOF mass spectrometry using
3-hydroxypicolinic acid (3-HPA) as matrix and a UV laser.

Specifically, the primers used in the PROBE reactions are as shown
below: P21/31-3 (SEQ ID NO: 12) for PROBE analysis of the p21
polymorphic site; P53/72 (SEQ ID NO: 4) for PROBE analysis of the p53
polymorphic site; and LPL-2 for PROBE analysis of the lipoprotein
lipase gene polymorphic site. In the PROBE analysis of the p21
polymorphic site, the extension reaction was performed using dideoxy-C.
The products resulting from the reaction conducted on a "wild-type"
allele template (wherein codon 31 encodes a serine) and from the
reaction conducted on a polymorphic S31R allele template (wherein codon
31 encodes an arginine) are shown below and designated as P21/31-3 Ser
(wt) (SEQ ID NO: 13) and P21/31-3 Arg (SEQ ID NO: 14), respectively.
The masses for each product as can be measured by MALDI-TOF mass
spectrometry are also provided (i.e., 4900.2 Da for the wild-type
product and 5213.4 Da for the polymorphic product).

In the PROBE analysis of the p53 polymorphic site, the extension
reaction was performed using dideoxy-C. The products resulting from the
reaction conducted on a "wild-type" allele template (wherein codon 72
encodes an arginine) and from the reaction conducted on a polymorphic
R72P allele template (wherein codon 72 encodes a proline) are shown
below and designated as Cod72 G Arg (wt) and Cod72 C Pro, respectively.
The masses for each product as can be measured by MALDI-TOF mass
spectrometry are also provided (i.e., 5734.8 Da for the wild-type
product and 5405.6 Da for the polymorphic product).

In the PROBE analysis of the lipoprotein lipase gene polymorphic site,
the extension reaction was performed using a mixture of ddA and ddT.
The products resulting from the reaction conducted on a "wild-type"
allele template (wherein codon 291 encodes an asparagine) and from the
reaction conducted on a polymorphic N291S allele template (wherein
codon 291 encodes a serine) are shown below and designated as 291 Asn
and 291 Ser, respectively. The masses for each product as can be
measured by MALDI-TOF mass spectrometry are also provided (i.e.,
6438.2 Da for the wild-type product and 6758.4 Da for the polymorphic
product).
P53-1 (R72P)
PCR Product length: 407 bp (SEQ ID NO: 1)

US4-p53-ex4-F
                                        ctg aggacctggt cctctgactg
ctcttttcac ccatctacag tcccccttgc cgtcccaagc aatggatgat ttgatgctgt
ccccggacga tattgaacaa tggttcactg aagacccagg tccagatgaa gctcccagaa
 P53/72            72R
tgccagaggc tgctccccgc gtggcccctg caccagcagc tcctacaccg gcggcccctg
                   c  72P
caccagcccc ctcctggccc ctgtcatctt ctgtcccttc ccagaaaacc taccagggca
gctacggttt ccgtctgggc ttcttgcatt ctgggacagc caagtctgtg acttgcacgg
tcagttgccc tgaggggctg gcttccatga gacttcaa
                      US5-p53/2-4R

Primers (SEQ ID NOs: 2-4)

p53-ex4FUS4  ccc agt cac gac gtt gta aaa cgc tga gga cct ggt cct ctg ac
US5P53/4R    agc gga taa caa ttt cac aca ggt tga agt ctc atg gaa gcc
P53/72       gcc aga ggc tgc tcc cc



Masses

Allele            Product Termination: ddC   SEQ #   Length   Mass
P53/72            gccagaggctgctcccc          5       17       5132.4
Cod72 G Arg (wt)  gccagaggctgctccccgc        6       19       5734.8
Cod72 C Pro       gccagaggctgctccccc         7       18       5405.6

Biotinylated US5 primer is used in the PCR amplification. 
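
The tabulated masses can be approximated from the sequences alone. The
following Python sketch is offered as an illustration, not as the
procedure used to generate the table. It assumes commonly used average
masses for nucleotide residues within a DNA chain, a correction of
-61.96 Da for an oligonucleotide with hydroxyl groups at both ends, and
a correction of -16.00 Da for a 3'-terminal dideoxynucleotide. The
names `RESIDUE_MASS`, `oligo_mass` and the two correction constants are
hypothetical.

```python
# Average residue masses in Da (assumed textbook values for DNA).
RESIDUE_MASS = {"a": 313.21, "c": 289.18, "g": 329.21, "t": 304.20}
END_CORRECTION = -61.96      # 5'-OH and 3'-OH ends, no terminal phosphate
DIDEOXY_CORRECTION = -16.00  # 3'-terminal ddNMP lacks one oxygen


def oligo_mass(sequence, dideoxy_terminated=False):
    """Approximate average mass (Da) of a single-stranded DNA oligo."""
    mass = sum(RESIDUE_MASS[base] for base in sequence.lower())
    mass += END_CORRECTION
    if dideoxy_terminated:
        mass += DIDEOXY_CORRECTION
    return mass


# Sequences from the p53 table above; products end in ddC.
primer = oligo_mass("gccagaggctgctcccc")
arg_wt = oligo_mass("gccagaggctgctccccgc", dideoxy_terminated=True)
pro = oligo_mass("gccagaggctgctccccc", dideoxy_terminated=True)

for name, mass in [("P53/72", primer), ("Cod72 G Arg", arg_wt), ("Cod72 C Pro", pro)]:
    print(f"{name}: {mass:.1f} Da")
```

Under these assumptions the computed values fall within about 0.1 Da of
the tabulated masses. The roughly 329 Da difference between the two
products corresponds to the one additional dG residue in the wild-type
product.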
LPL-1 (N291S)

Amino acid exchange asparagine to serine at codon 291 of the
lipoprotein lipase gene.

PCR Product length: 251 bp (SEQ ID NO: 15)

US4-LPL-F2 (SEQ ID NO: 16)
gcgctccatt catctcttca tcgactctct gttgaatgaa gaaaatccaa gtaaggccta
caggtgcagt tccaaggaag cctttgagaa agggctctgc ttgagttgta gaaagaaccg
      LPL-2                 291N
ctgcaacaat ctgggctatg agatcaataa agtcagagcc aaaagaagca gcaaaatgta
                            g  291S
cctgaagact cgttctcaga tgccc
      US5-LPL-R2

Primers (SEQ ID NOs: 16-18):

US4-LPL-F2  ccc agt cac gac gtt gta aaa cgg cgc tcc att cat ctc ttc
US5-LPL-R2  agc gga taa caa ttt cac aca ggg ggc atc tga gaa cga gtc
LPL-2       caa tct ggg cta tga gat ca



WO 01/27857 



PCT/US00/28413 



-47- 



Masses

Allele    Product Termination: ddA, ddT   SEQ #   Length   Mass
LPL-2     caatctgggctatgagatca            19      20       6141
291 Asn   caatctgggctatgagatcaa           20      21       6438.2
291 Ser   caatctgggctatgagatcagt          21      22       6758.4

Biotinylated US5 primer is used in the PCR amplification. 
P21-1 (S31R)

Amino acid exchange serine to arginine at codon 31 of the tumor
suppressor gene p21. Product length: 207 bp (SEQ ID NO: 8)

US4p21c31-2F
                                   gtcc gtcagaaccc atgcggcagc
                                   p21/31-3        31S
aaggcctgcc gccgcctctt cggcccagtg gacagcgagc agctgagccg cgactgtgat
                                                   a  31R
gcgctaatgg cgggctgcat ccaggaggcc cgtgagcgat ggaacttcga ctttgtcacc
gagacaccac tggaggg
   US5p21-2R

Primers (SEQ ID NOs: 9-11)

US4p21c31-2F  ccc agt cac gac gtt gta aaa cgg tcc gtc aga acc cat gcg g
US5p21-2R     agc gga taa caa ttt cac aca ggc tcc agt ggt gtc tcg gtg ac
P21/31-3      cag cga gca gct gag

Masses

Allele             Product Termination: ddC   SEQ #   Length   Mass
P21/31-3           cagcgagcagctgag            12      15       4627
P21/31-3 Ser (wt)  cagcgagcagctgagc           13      16       4900.2
P21/31-3 Arg       cagcgagcagctgagac          14      17       5213.4

Biotinylated US5 primer is used in the PCR amplification. 



Each of the Caucasian subject DNA samples was individually analyzed by
MALDI-TOF mass spectrometry to determine the identity of the nucleotide
at the polymorphic sites. The genotypic results of each assay can be
entered into the database. The results were then sorted according to
age and/or sex to determine the distribution of allelic frequencies by
age and/or sex. As depicted in the Figures showing histograms of the
results, in each case, there was a differential distribution of the
allelic frequencies of the genetic markers for the p21, p53 and
lipoprotein lipase gene polymorphisms.
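
The sorting step described above amounts to grouping genotype calls by
age (or sex) and computing genotype and allele frequencies within each
group. A minimal Python sketch follows. The records are invented for
illustration and are not data from this study; the function names and
the 50-year cutoff argument are hypothetical.

```python
from collections import Counter, defaultdict


def summarize(genotypes):
    """Return (genotype frequencies, allele frequencies) for one group.

    Each genotype is a two-letter string, one letter per allele.
    """
    n = len(genotypes)
    genotype_freq = {g: c / n for g, c in Counter(genotypes).items()}
    allele_counts = Counter(allele for g in genotypes for allele in g)
    # Every diploid individual contributes two alleles.
    allele_freq = {a: c / (2 * n) for a, c in allele_counts.items()}
    return genotype_freq, allele_freq


def by_age_group(records, cutoff):
    """Split (age, genotype) records at the cutoff and summarize each part."""
    groups = defaultdict(list)
    for age, genotype in records:
        groups["younger" if age < cutoff else "older"].append(genotype)
    return {name: summarize(g) for name, g in groups.items()}


# Invented example records: (age in years, genotype at p21 codon 31).
RECORDS = [
    (25, "SS"), (31, "SR"), (44, "SS"), (47, "SR"),
    (55, "SS"), (62, "SS"), (68, "SR"), (71, "SS"),
]

result = by_age_group(RECORDS, cutoff=50)
for name, (genotype_freq, allele_freq) in result.items():
    print(name, genotype_freq, allele_freq)
```

In this toy data set the heterozygous genotype falls from 50% in the
younger group to 25% in the older group, the kind of shift the
histograms in the Figures are meant to display.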

Figure 8 shows the results of the p21 genetic marker assays and reveals
a statistically significant decrease (from 13.3% to 9.2%) in the
frequency of the heterozygous genotype (S31R) in Caucasians with age
(18-49 years of age compared to 50-79 years of age). The frequencies of
the homozygous (S31 and R31) genotypes for the two age groups are also
shown, as are the overall frequencies of the S31 and R31 alleles in the
two age groups (designated as *S31 and *R31, respectively, in the
Figure).

Figures 7A-C show the results of the p53 genetic marker assays and
reveal a statistically significant decrease (from 6.7% to 3.7%) in the
frequency of the homozygous polymorphic genotype (P72) in Caucasians
with age (18-59 years of age compared to 60-79 years of age). The
frequencies of the homozygous "wild-type" genotype (R72) and the
heterozygous genotype (R72P) for the two age groups are also shown, as
are the overall frequencies of the R72 and P72 alleles in the two age
groups (designated as *R72 and *P72, respectively, in the Figure).
These results are consistent with the observation that the allele is
not benign, as p53 regulates expression of a second protein, p21, which
inhibits cyclin-dependent kinases (CDKs) needed to drive cells through
the cell cycle (a mutation in either gene can disrupt the cell cycle,
leading to increased cell division).

Figure 2C shows the results of the lipoprotein lipase gene genetic
marker assays and reveals a statistically significant decrease (from
1.97% to 0.54%) in the frequency of the polymorphic allele (S291) in
Caucasian males with age (see also Reymer et al. (1995) Nature Genetics
10:28-34). The frequencies of this allele in Caucasian females of
different age groups are also shown.

EXAMPLE 2 

This example describes the use of MALDI-TOF mass spectrometry to
analyze DNA samples of a number of subjects as individual samples and
as pooled samples of multiple subjects to assess the presence or
absence of a polymorphic allele (the 353Q allele) of the Factor VII
gene and determine the frequency of the allele in the group of
subjects. The results of this study show that essentially the same
allelic frequency can be obtained by analyzing pooled DNA samples as by
analyzing each sample separately and thereby demonstrate the
quantitative nature of MALDI-TOF mass spectrometry in the analysis of
nucleic acids.
Factor VII

Factor VII is a serine protease involved in the extrinsic blood
coagulation cascade. This factor is activated by thrombin and works
with tissue factor (Factor III) in the processing of Factor X to Factor
Xa. There is evidence that supports an association between
polymorphisms in the Factor VII gene and increased Factor VII activity,
which can result in an elevated risk of ischemic cardiovascular
disease, including myocardial infarction. The polymorphism investigated
in this study is R353Q (i.e., a substitution of a glutamine residue for
an arginine residue at codon 353 of the Factor VII gene) (see Table 5).

Analysis of DNA samples for the presence or absence of the 353Q allele
of the Factor VII gene

Genomic DNA was isolated from separate blood samples obtained from a
large number of subjects divided into multiple groups of 92 subjects
per group. Each sample of genomic DNA was analyzed using the
BiomassPROBE™ assay as described in Example 1 to determine the presence
or absence of the 353Q polymorphism of the Factor VII gene.

First, DNA from each sample was amplified in a polymerase chain
reaction using primers F7-353FUS4 (SEQ ID NO: 24) and F7-353RUS5 (SEQ
ID NO: 26) as shown below and using standard conditions, for example,
as described in Example 1. One of the primers was biotinylated to
permit immobilization of the amplification product to a solid support.
The purified amplification products were immobilized via a
biotin-avidin linkage to streptavidin-coated beads and the
double-stranded DNA was denatured. A detection primer was then annealed
to the immobilized DNA using conditions such as, for example, those
described in Example 1. The detection primer is shown as F7-353-P (SEQ
ID NO: 27) below. The PROBE extension reaction was carried out using
conditions, for example, such as those described in Example 1. The
reaction was performed using ddG.

The DNA was denatured to release the extended primers from the
immobilized template. Each of the resulting extension products was
separately analyzed by MALDI-TOF mass spectrometry. A matrix such as
3-hydroxypicolinic acid (3-HPA) and a UV laser could be used in the
MALDI-TOF mass spectrometric analysis. The products resulting from the
reaction conducted on a "wild-type" allele template (wherein codon 353
encodes an arginine) and from the reaction conducted on a polymorphic
353Q allele template (wherein codon 353 encodes a glutamine) are shown
below and designated as 353 CGG and 353 CAG, respectively. The masses
for each product as can be measured by MALDI-TOF mass spectrometry are
also provided (i.e., 5646.8 Da for the wild-type product and 5960 Da
for the polymorphic product).

The MALDI-TOF mass spectrometric analyses of the PROBE reactions of
each DNA sample were first conducted separately on each sample (250
nanograms total concentration of DNA per analysis). The allelic
frequency of the 353Q polymorphism in the group of 92 subjects was
calculated based on the number of individual subjects in which it was
detected.

Next, the samples from 92 subjects were pooled (250 nanograms total
concentration of DNA, in which the concentration of any individual DNA
is 2.7 nanograms) and the pool of DNA was subjected to MALDI-TOF mass
spectrometric analysis. The area under the signal corresponding to the
mass of the 353Q polymorphism PROBE extension product in the resulting
spectrum was integrated in order to quantitate the amount of DNA
present. The ratio of this amount to total DNA was used to determine
the allelic frequency of the 353Q polymorphism in the group of
subjects. This type of individual sample vs. pooled sample analysis was
repeated for numerous different groups of 92 different samples.
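
The two ways of estimating the allelic frequency can be sketched in
Python as follows. This is an illustration under stated assumptions,
not the analysis software used in the study. In particular, "total DNA"
is taken here to be the sum of the integrated areas of the two
allele-specific extension products, and the individual-sample frequency
is taken to be allele copies divided by twice the number of subjects.
The function names and the peak areas are hypothetical.

```python
def per_sample_dna_ng(total_ng, n_samples):
    """Amount of each individual DNA in an equimolar pool."""
    return total_ng / n_samples


def frequency_from_genotypes(variant_copies, n_subjects):
    """Allele frequency from individually genotyped diploid subjects."""
    return variant_copies / (2 * n_subjects)


def frequency_from_pool(variant_area, wild_type_area):
    """Allele frequency from integrated peak areas of a pooled spectrum."""
    return variant_area / (variant_area + wild_type_area)


# 250 ng of pooled DNA from 92 subjects, as described in the text.
print(round(per_sample_dna_ng(250, 92), 1))

# Illustrative count: 21 copies of the 353Q allele among 92 subjects.
print(round(100 * frequency_from_genotypes(21, 92), 2))

# Illustrative peak areas in arbitrary units.
print(round(100 * frequency_from_pool(12.09, 87.91), 2))
```

The first line reproduces the 2.7 nanograms per individual DNA stated
in the text. The count of 21 allele copies is an assumption chosen for
illustration; it is not given in the text.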

The frequencies calculated based on individual MALDI-TOF mass
spectrometric analysis of the 92 separate samples of each group of 92
are compared to those calculated based on MALDI-TOF mass spectrometric
analysis of pools of DNA from 92 samples in Figure 9. These comparisons
are shown as "pairs" of bar graphs in the Figure, each pair being
labeled as a separate "pool" number, e.g., P1, P16, P2, etc. Thus, for
example, for P1, the allelic frequency of the polymorphism calculated
by separate analysis of each of the 92 samples was 11.41% and the
frequency calculated by analysis of a pool of all of the 92 DNA samples
was 12.09%.

The similarity in frequencies calculated by analyzing separate DNA
samples individually and by pooling the DNA samples demonstrates that
it is possible, through the quantitative nature of MALDI-TOF mass
spectrometry, to analyze pooled samples and obtain accurate frequency
determinations. The ability to analyze pooled DNA samples significantly
reduces the time and costs involved in the use of the non-selected,
healthy databases as described herein. It has also been shown that it
is possible to decrease the DNA concentration of the individual samples
in a pooled mixture from 2.7 nanograms to 0.27 nanograms without any
change in the quality of the spectrum or the ability to quantitate the
amount of sample detected.
Factor VII R353Q PROBE Assay

PROBE Assay for cod353 CGG>CAG (Arg>Gln), Exon 9 G>A.
PCR fragment: 134 bp (incl. US tags; SEQ ID NOs: 22 and 23)
Frequency of A allele: Europeans about 0.1, Japanese/Chinese about
0.03-0.05 (Thromb. Haemost. 1995, 73:617-22; Diabetologia 1998,
41:760-6):

                   F7-353FUS4>
1201 GTGCCGGCTA CTCGGATGGC AGCAAGGACT CCTGCAAGGG GGACAGTGGA GGCCCACATG
     F7-353-P>    A         <F7-353RUS5
1261 CCACCCACTA CCGGGGCACG TGGTACCTGA CGGGCATCGT CAGCTGGGGC CAGGGCTGCG

Primers (SEQ ID NOs: 24-26)                                            Tm 9a
F7-353FUS4  CCC AGT CAC GAC GTT GTA AAA CGA TGG CAG CAA GGA CTC CTG    64 °C
F7-353-P    CAC ATG CCA CCC ACT ACC
F7-353RUS5  AGC GGA TAA CAA TTT CAC ACA GGT GAC GAT GCC CGT CAG GTA C  64 °C

Masses

Allele     Product Termination: ddG      SEQ #   Length   Mass
F7-353-P   cacatgccacccactacc            27      18       5333.6
353 CGG    cacatgccacccactaccg           28      19       5646.8
353 CAG    cacatgccacccactaccag          29      20       5960
US5-bio    bio-agcggataacaatttcacacagg   30      23       7648.6

Conclusion 

The above examples demonstrate an effect of altered frequency of
disease-causing genetic factors within the general population.
Interpretation of those results allows prediction of the medical
relevance of polymorphic genetic alterations. In addition, conclusions
can be drawn with regard to their penetrance, diagnostic specificity,
positive predictive value, onset of disease, most appropriate onset of
preventive strategies, and the general applicability of genetic
alterations identified in isolated populations to panmixed populations.
Therefore, an age- and sex-stratified population-based sample bank that
is ethnically homogeneous is a suitable tool for rapid identification
and validation of genetic factors regarding their potential medical
utility.

EXAMPLE 3
MORBIDITY AND MORTALITY MARKERS
Sample Bank and Initial Screening

Healthy samples were obtained through the blood bank of San Bernardino,
CA. Prior to blood collection, donors signed a consent form and agreed
that their blood would be used in genetic studies with regard to human
aging. All samples were anonymized. Tracking back of samples is not
possible.

Isolation of DNA from blood samples of a healthy donor population

Blood was obtained from a donor by venous puncture and preserved with
1 mM EDTA, pH 8.0. Ten milliliters of whole blood from each donor was
centrifuged at 2000 x g. One milliliter of the buffy coat was added to
9 milliliters of 155 mM NH4Cl, 10 mM KHCO3 and 0.1 mM Na2EDTA,
incubated 10 minutes at room temperature and centrifuged for 10 minutes
at 2000 x g. The supernatant was removed, and the white cell pellet was
washed in 155 mM NH4Cl, 10 mM KHCO3 and 0.1 mM Na2EDTA and resuspended
in 4.5 milliliters of 50 mM Tris, 5 mM EDTA and 1% SDS. Proteins were
precipitated from the cell lysate by 6 M ammonium acetate, pH 7.3, and
separated from the nucleic acid by centrifugation at 3000 x g. The
nucleic acid was recovered from the supernatant by the addition of an
equal volume of 100% isopropanol and centrifugation at 2000 x g. The
dried nucleic acid pellet was hydrated in 10 mM Tris, pH 7.6, and 1 mM
Na2EDTA and stored at 4°C.

In this study, samples were pooled as shown in Table 1. Both parents of
the blood donors were of Caucasian origin.
Table 1

Pool ID   Sex       Age-range     # individuals
SP1       Females   18-39 years   276
SP2       Males     18-39 years   276
SP3       Females   60-69 years   184
SP4       Males     60-79 years   368

More than 400 SNPs were tested using all four pools. After one test
run, 34 assays were selected to be re-assayed at least once. Finally,
10 assays repeatedly showed differences in allele frequencies of
several percent and, therefore, fulfilled the criteria to be tested
using the individual samples. Average allele frequencies and standard
deviations are tabulated in Table 2.
Table 2

Assay ID  SP1    SP1-STD  SP2    SP2-STD  SP3    SP3-STD  SP4    SP4-STD
47861     0.457  0.028    0.433  0.042    0.384  0.034    0.380  0.015
47751     0.276  0.007    0.403  0.006    0.428  0.052    0.400  0.097
48319     0.676  0.013    0.627  0.018    0.755  0.009    0.686  0.034
48070     0.581  0.034    0.617  0.045    0.561  n.a.     0.539  0.032
49807     0.504  0.034    0.422  0.020    0.477  0.030    0.556  0.005
49534     0.537  0.017    0.503  n.a.     0.623  0.023    0.535  0.009
49733     0.560  0.006    0.527  0.059    0.546  0.032    0.436  0.016
49947     0.754  0.008    0.763  0.047    0.736  0.052    0.689  0.025
50128     0.401  0.022    0.363  0.001    0.294  0.059    0.345  0.013
63306     0.697  0.012    0.674  0.013    0.712  0.017    0.719  0.005
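
The screening criterion described above (differences in allele
frequency of several percent between pools) can be sketched in Python
using the mean values of Table 2. This is an illustration only. The
comparison of older with younger pools of the same sex (SP3 with SP1,
SP4 with SP2) and the 0.05 threshold are assumptions made for the
sketch; the text does not state the exact criterion. Note also that the
older female and male pools cover different age ranges (60-69 and 60-79
years).

```python
# Mean allele frequencies from Table 2: (SP1, SP2, SP3, SP4).
TABLE_2_MEANS = {
    "47861": (0.457, 0.433, 0.384, 0.380),
    "47751": (0.276, 0.403, 0.428, 0.400),
    "48319": (0.676, 0.627, 0.755, 0.686),
    "48070": (0.581, 0.617, 0.561, 0.539),
    "49807": (0.504, 0.422, 0.477, 0.556),
    "49534": (0.537, 0.503, 0.623, 0.535),
    "49733": (0.560, 0.527, 0.546, 0.436),
    "49947": (0.754, 0.763, 0.736, 0.689),
    "50128": (0.401, 0.363, 0.294, 0.345),
    "63306": (0.697, 0.674, 0.712, 0.719),
}


def age_differences(means):
    """Return (female, male) frequency change, older pool minus younger."""
    sp1, sp2, sp3, sp4 = means
    return sp3 - sp1, sp4 - sp2


def flag_assays(table, threshold):
    """Keep assays whose frequency shifts by more than the threshold."""
    flagged = {}
    for assay, means in table.items():
        female, male = age_differences(means)
        if abs(female) > threshold or abs(male) > threshold:
            flagged[assay] = (round(female, 3), round(male, 3))
    return flagged


flagged = flag_assays(TABLE_2_MEANS, threshold=0.05)  # threshold is assumed
for assay, (female, male) in flagged.items():
    print(assay, f"females {female:+.3f}", f"males {male:+.3f}")
```

A fuller treatment would also take the tabulated standard deviations
into account before an assay is carried forward to individual
genotyping.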



So far, 7 out of the 10 potential morbidity markers were fully
analyzed. Additional information about the genes in which these SNPs
are located was gathered through publicly available databases such as
GenBank.
AKAPs

Candidate morbidity and mortality markers include housekeeping genes,
such as genes involved in signal transduction. Among such genes are the
A-kinase anchoring protein (AKAP) genes, which participate in signal
transduction pathways involving protein phosphorylation. Protein
phosphorylation is an important mechanism for enzyme regulation and the
transduction of extracellular signals across the cell membrane in
eukaryotic cells. A wide variety of cellular substrates, including
enzymes, membrane receptors, ion channels and transcription factors,
can be phosphorylated in response to extracellular signals that
interact with cells. A key enzyme in the phosphorylation of cellular
proteins in response to hormones and neurotransmitters is cyclic AMP
(cAMP)-dependent protein kinase (PKA). Upon activation by cAMP, PKA
thus mediates a variety of cellular responses to such extracellular
signals. An array of PKA isozymes is expressed in mammalian cells. The
PKAs usually exist as inactive tetramers containing a regulatory (R)
subunit dimer and two catalytic (C) subunits. Genes encoding three C
subunits (Cα, Cβ and Cγ) and four R subunits (RIα, RIβ, RIIα and RIIβ)
have been identified [see Takio et al. (1982) Proc. Natl. Acad. Sci.
U.S.A. 75:2544-2548; Lee et al. (1983) Proc. Natl. Acad. Sci. U.S.A.
50:3608-3612; Jahnsen et al. (1996) J. Biol. Chem. 257:12352-12361;
Clegg et al. (1988) Proc. Natl. Acad. Sci. U.S.A. 55:3703-3707; and
Scott (1991) Pharmacol. Ther. 50:123-145]. The type I (RI) α and type
II (RII) α subunits are distributed ubiquitously, whereas RIβ and RIIβ
are present mainly in brain [see, e.g., Miki and Eddy (1999) J. Biol.
Chem. 274:29057-29062]. The type I PKA holoenzyme (RIα and RIβ) is
predominantly cytoplasmic, whereas the majority of type II PKA (RIIα
and RIIβ) associates with cellular structures and organelles [Scott
(1991) Pharmacol. Ther. 50:123-145]. Many hormones and other signals
act through receptors to generate cAMP, which binds to the R subunits
of PKA and releases and activates the C subunits to phosphorylate
proteins. Because protein kinases and their substrates are widely
distributed throughout cells, there are mechanisms in place in cells to
localize protein kinase-mediated responses to different signals. One
such mechanism involves subcellular targeting of PKAs through
association with anchoring proteins, referred to as A-kinase anchoring
proteins (AKAPs), that place PKAs in close proximity to specific
organelles or cytoskeletal components and particular substrates,
thereby providing for more specific PKA interactions and localized
responses [see, e.g., Scott et al. (1990) J. Biol. Chem.
255:21561-21566; Bregman et al. (1991) J. Biol. Chem. 255:7207-7213;
and Miki and Eddy (1999) J. Biol. Chem. 274:29057-29062]. Anchoring not
only places the kinase close to preferred substrates, but also
positions the PKA holoenzyme at sites where it can optimally respond to
fluctuations in the second messenger cAMP [Mochly-Rosen (1995) Science
255:247-251; Faux and Scott (1996) Trends Biochem. Sci. 27:312-315;
Hubbard and Cohen (1993) Trends Biochem. Sci. 75:172-177].

Up to 75% of type II PKA is localized to various intracellular sites
through association of the regulatory subunit (RII) with AKAPs [see,
e.g., Hausken et al. (1996) J. Biol. Chem. 277:29016-29022]. RII
subunits of PKA bind to AKAPs with nanomolar affinity [Carr et al.
(1992) J. Biol. Chem. 257:13376-13382], and many AKAP-RII complexes
have been isolated from cell extracts. RI subunits of PKA bind to AKAPs
with only micromolar affinity [Burton et al. (1997) Proc. Natl. Acad.
Sci. U.S.A. 94:11067-11072]. Evidence of binding of a PKA RI subunit to
an AKAP has been reported [Miki and Eddy (1998) J. Biol. Chem.
273:34384-34390] in which RIα-specific and RIα/RIIα dual specificity
PKA anchoring domains were identified on FSC1/AKAP82. Additional dual
specific AKAPs, referred to as D-AKAP1 and D-AKAP2, which interact with
the type I and type II regulatory subunits of PKA, have also been
reported [Huang et al. (1997) J. Biol. Chem. 272:8057-8064; Huang et
al. (1997) Proc. Natl. Acad. Sci. U.S.A. 94:11184-11189].

More than 20 AKAPs have been reported in different tissues and species.
Complementary DNAs (cDNAs) encoding AKAPs have been isolated from
diverse species, ranging from Caenorhabditis elegans and Drosophila to
human [see, e.g., Colledge and Scott (1999) Trends Cell Biol.
5:216-221]. Regions within AKAPs that mediate association with RII
subunits of PKA have been identified. These regions of approximately
10-18 amino acid residues vary substantially in primary sequence, but
secondary structure predictions indicate that they are likely to form
an amphipathic helix with hydrophobic residues aligned along one face
of the helix and charged residues along the other [Carr et al. (1991)
J. Biol. Chem. 266:14188-14192; Carr et al. (1992) J. Biol. Chem.
267:13376-13382]. Hydrophobic amino acids with a long aliphatic side
chain, e.g., valine, leucine or isoleucine, may participate in binding
to RII subunits [Glantz et al. (1993) J. Biol. Chem. 268:12796-12804].

Many AKAPs also have the ability to bind to multiple proteins,
including other signaling enzymes. For example, AKAP79 binds to PKA,
protein kinase C (PKC) and the protein phosphatase calcineurin (PP2B)
[Coghlan et al. (1995) Science 267:108-112 and Klauck et al. (1996)
Science 277:1589-1592]. Therefore, the targeting of AKAP79 to neuronal
postsynaptic membranes brings together enzymes with opposite catalytic
activities in a single complex.

AKAPs thus serve as potential regulatory mechanisms that increase the
selectivity and intensity of a cAMP-mediated response. There is a need,
therefore, to identify and elucidate the structural and functional
properties of AKAPs in order to gain a complete understanding of the
important role these proteins play in the basic functioning of cells.
AKAP10

The sequence of a human AKAP10 cDNA (also referred to as D-AKAP2) is
available in the GenBank database, at accession numbers AF037439 (SEQ
ID NO: 31) and NM_007202. The AKAP10 gene is located on chromosome 17.

The sequence of a mouse D-AKAP2 cDNA is also available in the GenBank
database (see accession number AF021833). The mouse D-AKAP2 protein
contains an RGS domain near the amino terminus that is characteristic
of proteins that interact with Gα subunits and possess GTPase
activating protein-like activity [Huang et al. (1997) Proc. Natl. Acad.
Sci. U.S.A. 54:11184-11189]. The human AKAP10 protein also has
sequences homologous to RGS domains. The carboxy-terminal 40 residues
of the mouse D-AKAP2 protein are responsible for the interaction with
the regulatory subunits of PKA. This sequence is fairly well conserved
between the mouse D-AKAP2 and human AKAP10 proteins.

Polymorphisms of the human AKAP10 gene and polymorphic AKAP10 proteins

Polymorphisms of AKAP genes that alter gene expression, regulation,
protein structure and/or protein function are more likely to have a
significant effect on the regulation of enzyme (particularly PKA)
activity, cellular transduction of signals and responses thereto and on
the basic functioning of cells than polymorphisms that do not alter
gene and/or protein function. Included in the polymorphic AKAPs
provided herein are human AKAP10 proteins containing differing amino
acid residues at position number 646.

Amino acid 646 of the human AKAP10 protein is located in the
carboxy-terminal region of the protein within a segment that
participates in the binding of R-subunits of PKAs. This segment
includes the carboxy-terminal 40 amino acids.

The amino acid residue reported for position 646 of the human AKAP10
protein is an isoleucine. Polymorphic human AKAP10 proteins provided
herein have the amino acid sequence but contain residues other than
isoleucine at amino acid position 646 of the protein. In particular
embodiments of the polymorphic human AKAP10 proteins provided herein,
the amino acid at position 646 is a valine, leucine or phenylalanine
residue.

An A to G transition at nucleotide 2073 of the human AKAP10 coding
sequence

As described herein, an allele of the human AKAP10 gene that contains a
specific polymorphism at position 2073 of the coding sequence and
thereby encodes a valine at position 646 has been detected in varying
frequencies in DNA samples from younger and older segments of the human
population. In this allele, the A at position 2073 of the AKAP10 gene
coding sequence is changed to a G, giving rise to an altered sequence
in which the codon for amino acid 646 changes from ATT, coding for
isoleucine, to GTT, coding for valine.
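
The effect of a single-base change on the encoded amino acid can be
checked against the standard genetic code. The Python sketch below is
illustrative; it includes only the codons that appear in the examples
of this document (AKAP10, p53, lipoprotein lipase, p21 and Factor VII),
and the names `CODONS`, `point_mutation` and `describe` are
hypothetical.

```python
# Subset of the standard genetic code covering the codons discussed here.
CODONS = {
    "ATT": "Ile", "GTT": "Val",  # AKAP10 codon 646
    "CGC": "Arg", "CCC": "Pro",  # p53 codon 72
    "AAT": "Asn", "AGT": "Ser",  # lipoprotein lipase codon 291
    "AGC": "Ser", "AGA": "Arg",  # p21 codon 31
    "CGG": "Arg", "CAG": "Gln",  # Factor VII codon 353
}


def point_mutation(codon, position, new_base):
    """Return the codon with the base at a 0-based position replaced."""
    bases = list(codon.upper())
    bases[position] = new_base.upper()
    return "".join(bases)


def describe(codon, position, new_base):
    """Summarize the codon change and the resulting amino acid change."""
    original = codon.upper()
    mutant = point_mutation(original, position, new_base)
    return f"{original} ({CODONS[original]}) -> {mutant} ({CODONS[mutant]})"


print(describe("ATT", 0, "G"))  # AKAP10: A to G at the first codon position
print(describe("CGG", 1, "A"))  # Factor VII: G to A at the second position
```

The first call reproduces the isoleucine to valine exchange described
above for the A to G transition.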

Morbidity marker 1: human protein kinase A anchoring protein
(AKAP10-1)

PCR Amplification and BiomassPROBE assay detection of AKAP10-1 in a
healthy donor population

PCR Amplification of donor population for AKAP 10 
PCR primers were synthesized by OPERON using phosphoramidite 
chemistry. Amplification of the AKAP10 target sequence was carried out 
in single 50//I PCR reaction with lOOng-lug of pooled human genomic 

30 DNAs in a 50//I PCR reaction. Individual DNA concentrations within the 



pooled samples were present in equal concentration with the final
concentration ranging from 1-25 ng. Each reaction contained 1X PCR
buffer (Qiagen, Valencia, CA), 200 µM dNTPs, 1 U Hotstar Taq
polymerase (Qiagen, Valencia, CA), 4 mM MgCl2, 25 pmol of the
forward primer containing the universal primer sequence and the target
specific sequence 5'-TCTCAATCATGTGCATTGAGG-3' (SEQ ID NO: 45),
2 pmol of the reverse primer
5'-AGCGGATAACAATTTCACACAGGGATCACACAGCCATCAGCAG-3'
(SEQ ID NO: 46), and 10 pmol of a biotinylated universal primer
complementary to the 5' end of the PCR amplicon
5'-AGCGGATAACAATTTCACACAGG-3' (SEQ ID NO: 47). After an initial
round of amplification of the target with the specific forward and
reverse primers, the 5' biotinylated universal primer hybridized and
acted as a reverse primer, thereby introducing a 3' biotin capture
moiety into the molecule. The amplification protocol results in a
5'-biotinylated double stranded DNA amplicon and dramatically reduces
the cost of high throughput genotyping by eliminating the need to 5'
biotin label each forward primer used in genotyping. Thermal cycling
was performed in 0.2 mL tubes or a 96 well plate using an MJ Research
Thermal Cycler (calculated temperature) with the following cycling
parameters: 94° C for 5 min; 45 cycles: 94° C for 20 sec, 56° C for
30 sec, 72° C for 60 sec; 72° C for 3 min.
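
For planning purposes, the cycling parameters above can be written
down as data and the total programmed hold time computed from them.
The following is a minimal sketch only; the class and function names
are hypothetical, and the result ignores the ramp time between
temperatures, which depends on the instrument.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One hold step of a thermal cycling program."""
    temp_c: float
    seconds: int


# Program as stated in the text.
INITIAL_DENATURATION = Step(94, 5 * 60)
CYCLE = (Step(94, 20), Step(56, 30), Step(72, 60))
N_CYCLES = 45
FINAL_EXTENSION = Step(72, 3 * 60)


def hold_time_seconds(initial, cycle, n_cycles, final) -> int:
    """Sum of programmed hold times; ramp times are not included."""
    per_cycle = sum(step.seconds for step in cycle)
    return initial.seconds + n_cycles * per_cycle + final.seconds


if __name__ == "__main__":
    total = hold_time_seconds(
        INITIAL_DENATURATION, CYCLE, N_CYCLES, FINAL_EXTENSION
    )
    print(f"{total} s ({total / 60:.1f} min) excluding ramp time")
```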
Immobilization of DNA 

The 50 µl PCR reaction was added to 25 µl of streptavidin coated
magnetic beads (Dynal) prewashed three times and resuspended in 1 M
NH4Cl, 0.06 M NH4OH. The PCR amplicons were allowed to bind to the
beads for 15 minutes at room temperature. The beads were then
collected with a magnet and the supernatant containing unbound DNA was
removed. The unbound strand was released from the double stranded
amplicons by incubation in 100 mM NaOH and washing of the beads three
times with 10 mM Tris pH 8.0.

BiomassPROBE assay analysis of donor population for AKAP10-1 (clone
48319)

Genotyping using the BiomassPROBE assay methods was carried out by
resuspending the DNA coated magnetic beads in 26 mM Tris-HCl pH 9.5,
6.5 mM MgCl2, 50 mM dTTP, 50 mM each of ddCTP, ddATP, ddGTP, 2.5 U
of a thermostable DNA polymerase (Amersham) and 20 pmol of a
template specific oligonucleotide PROBE primer
5'-CTGGCGCCCACGTGGTCAA-3' (SEQ ID NO: 48) (Operon). Primer extension
occurs with three cycles of oligonucleotide primer hybridization and
extension. The extension products were analyzed after denaturation
from the template with 50 mM NH4Cl and transfer of 150 nL of each
sample to a silicon chip preloaded with 150 nL of H3PA matrix
material. The sample material was allowed to crystallize and was
analyzed by MALDI-TOF (Bruker, PerSeptive). The SNP that is present in
AKAP10-1 is a T to C transition at nucleotide number 156277 of the
sequence of a genomic clone of the AKAP10 gene (GenBank Accession
No. AC005730) (SEQ ID NO: 36). SEQ ID NO: 35 represents the
nucleotide sequence of human chromosome 17, which contains the
genomic nucleotide sequence of the human AKAP10 gene, and SEQ ID
NO: represents the nucleotide sequence of human chromosome 17, which
contains the genomic nucleotide sequence of the human AKAP10-1
allele. The mass of the primer used in the BiomassPROBE reaction was
5500.6 daltons. In the presence of the SNP, the primer is extended by
the addition of ddC, giving an extension product with a mass of
5773.8 daltons. The wildtype gene results in the addition of dT and
ddG to the primer to produce an extension product having a mass of
6101 daltons.
The frequency of the SNP was measured in a population of age
selected healthy individuals. Five hundred fifty-two (552) individuals
between the ages of 18-39 years (276 females, 276 males) and 552
individuals between the ages of 60-79 (184 females between the ages of
60-69, 368 males between the ages of 60-79) were tested for the
presence of the polymorphism localized in the non-translated 3' region
of AKAP10. Differences in the frequency of this polymorphism with
increasing age were observed among healthy individuals. Statistical
analysis showed that, for the total population, the significance level
for differences between the "younger" and the "older" populations was
p = 0.0009 for allelic frequency and p = 0.003 for genotype frequency.

This marker gave the most significant result with regard to allele
and genotype frequencies in the age-stratified population. Figure 19
shows the allele and genotype frequencies in both genders as well as
in the entire population. The young and old populations were in
Hardy-Weinberg equilibrium. A preferential change of one particular
genotype was not seen.
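
A Hardy-Weinberg check of the kind referred to above can be sketched
as follows. The specific test used for the reported result is not
stated herein; this sketch assumes a chi-square goodness-of-fit test
with one degree of freedom, and the function names and example counts
are hypothetical.

```python
# Minimal sketch of a Hardy-Weinberg equilibrium check for a biallelic
# marker. The chi-square goodness-of-fit approach is an assumption.
CRITICAL_VALUE_1DF_P05 = 3.841  # chi-square critical value, 1 df, p = 0.05


def hardy_weinberg_chi_square(n_hom_ref: int, n_het: int, n_hom_alt: int):
    """Return (reference allele frequency, expected counts, chi-square)."""
    n = n_hom_ref + n_het + n_hom_alt
    p = (2 * n_hom_ref + n_het) / (2 * n)  # reference allele frequency
    q = 1.0 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_hom_ref, n_het, n_hom_alt)
    chi2 = sum(
        (o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0
    )
    return p, expected, chi2


def is_consistent_with_hwe(n_hom_ref: int, n_het: int, n_hom_alt: int) -> bool:
    """True if the deviation from equilibrium is not significant at 0.05."""
    _, _, chi2 = hardy_weinberg_chi_square(n_hom_ref, n_het, n_hom_alt)
    return chi2 < CRITICAL_VALUE_1DF_P05


if __name__ == "__main__":
    # Hypothetical genotype counts, for illustration only.
    print(hardy_weinberg_chi_square(25, 50, 25))
    print(is_consistent_with_hwe(25, 50, 25))
```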

The polymorphism is localized in the non-translated 3' region of the
gene encoding the human protein kinase A anchoring protein (AKAP10).
The gene is located on chromosome 17. Its structure includes 15 exons
and 14 intervening sequences (introns). The encoded protein is
responsible for the sub-cellular localization of the cAMP-dependent
protein kinase and, therefore, plays a key role in the G-protein
mediated receptor-signaling pathway (Huang et al. PNAS (1997)
94:11184-11189). Since it lies outside the coding region, this
polymorphism is most likely in linkage disequilibrium (LD) with other,
non-synonymous polymorphisms that could cause amino acid substitutions
and subsequently alter the function of the protein. Sequence
comparison of different GenBank database entries concerning this gene
revealed six further potential polymorphisms, of which two are
predicted to change the respective amino acid (see Table 3).
Table 3

Exon    Codon    Nucleotides    Amino acid
3       100      GCT>GCC        Ala>Ala
4       177      ATG>GTG        Met>Val
8       424      GG?>GGC        Gly>Gly
10      524      CCG>CTG        Pro>Leu
12      591      GTG>GTC        Val>Val
12      599      CG?>CGA        Arg>Arg

Morbidity marker 2: human protein kinase A anchoring protein
(AKAP10-5)

Discovery of AKAP10-5 Allele (SEQ ID NO: 33)

Genomic DNA was isolated from blood (as described above) of
seventeen (17) individuals with a genotype CC at the AKAP10-1 gene
locus and a single heterozygous individual (CT) (as described). A
target sequence in the AKAP10-1 gene which encodes the C-terminal PKA
binding domain was amplified using the polymerase chain reaction. PCR
primers were synthesized by OPERON using phosphoramidite chemistry.
Amplification of the AKAP10-1 target sequence was carried out in
individual 50 µl PCR reactions with 25 ng of human genomic DNA
template. Each reaction contained 1X PCR buffer (Qiagen, Valencia,
CA), 200 µM dNTPs, 1 U Hotstar Taq polymerase (Qiagen, Valencia, CA),
4 mM MgCl2, 25 pmol of the forward primer (Ex13F) containing the
universal primer sequence and the target specific sequence 5'-TCC CAA
AGT GCT GGA ATT AC-3' (SEQ ID NO: 53), and 2 pmol of the reverse
primer (Ex14R) 5'-GTC CAA TAT ATG CAA ACA GTT G-3' (SEQ ID NO:
54). Thermal cycling was performed in 0.2 mL tubes or a 96 well plate
using an MJ Research Thermal Cycler (MJ Research, Waltham, MA)
(calculated temperature) with the following cycling parameters: 94° C
for 5 min; 45 cycles: 94° C for 20 sec, 56° C for 30 sec, 72° C for
60 sec; 72° C for 3 min. After amplification the amplicons were
purified by chromatography (Mo Bio Laboratories, Solana Beach, CA).

The sequence of the 18 amplicons, representing the target region,
was determined using a standard Sanger cycle sequencing method with
25 nmol of the PCR amplicon, 3.2 µM DNA sequencing primer 5'-CCC ACA
GCA GTT AAT CCT TC-3' (SEQ ID NO: 55), and chain terminating
dRhodamine labeled 2',3' dideoxynucleotides (PE Biosystems, Foster
City, CA) using the following cycling parameters: 96° C for 15
seconds; 25 cycles: 55° C for 15 seconds, 60° C for 4 minutes. The
sequencing products were precipitated with 0.3 M NaOAc and ethanol.
The precipitate was centrifuged and dried. The pellets were
resuspended in deionized formamide and separated on a 5%
polyacrylamide gel. The sequence was determined using the
"Sequencher" software (Gene Codes, Ann Arbor, MI).

The sequence of all 17 of the amplicons that are homozygous for the
AKAP10-1 SNP revealed a polymorphism at nucleotide position 152171
(numbering for GenBank Accession No. AC005730 for AKAP10 genomic
clone (SEQ ID NO: 35)) with A replaced by G. This SNP can also be
designated as located at nucleotide 2073 of a cDNA clone of the
wildtype AKAP10 (GenBank Accession No. AF037439) (SEQ ID NO: 31). The
amino acid sequence of the human AKAP10 protein is provided as SEQ ID
NO: 32. This single nucleotide polymorphism was designated as
AKAP10-5 (SEQ ID NO: 33) and resulted in a substitution of a valine
for an isoleucine residue at amino acid position 646 of the amino acid
sequence of human AKAP10 (SEQ ID NO: 32).

PCR Amplification and BiomassPROBE assay detection of AKAP10-5 in a
healthy donor population

The healthy population stratified by age is a very efficient and
universal screening tool for morbidity associated genes, because it
allows for the detection of changes of allelic frequencies in the
young compared to the old population. Individual samples of this
healthy population base can be pooled to further increase the
throughput.

Healthy samples were obtained through the blood bank of San
Bernardino, CA. Both parents of the blood donors were of Caucasian
origin. Practically, a healthy subject, when human, is defined as a
human donor who passes blood bank criteria to donate blood for
eventual use in the general population. These criteria are as follows:
free of detectable viral, bacterial, mycoplasma, and parasitic
infections; not anemic; and then further selected based upon a
questionnaire regarding history (see Figure 3). Thus, a healthy
population represents an unbiased population of sufficient health to
donate blood according to blood bank criteria, and not further
selected for any disease state. Typically such individuals are not
taking any medications.

PCR primers were synthesized by OPERON using phosphoramidite
chemistry. Amplification of the AKAP10 target sequence was carried out
in a single 50 µl PCR reaction with 100 ng-1 µg of pooled human genomic
DNAs. Individual DNA concentrations within the pooled samples were
present in equal concentration with the final concentration ranging
from 1-25 ng. Each reaction contained 1X PCR buffer (Qiagen, Valencia,
CA), 200 µM dNTPs, 1 U Hotstar Taq polymerase (Qiagen, Valencia, CA),
4 mM MgCl2, 25 pmol of the forward primer containing the universal
primer sequence and the target specific sequence
5'-AGCGGATAACAATTTCACACAGGGAGCTAGCTTGGAAGATTGC-3' (SEQ ID NO: 41),
2 pmol of the reverse primer
5'-GTCCAATATATGCAAACAGTTG-3' (SEQ ID NO: 54), and 10 pmol of a
biotinylated universal primer complementary to the 5' end of the PCR
amplicon BIO:5'-AGCGGATAACAATTTCACACAGG-3' (SEQ ID NO: 43).
After an initial round of amplification of the target with the
specific forward and reverse primers, the 5' biotinylated universal
primer can then hybridize and act as a forward primer, thereby
introducing a 5' biotin capture moiety into the molecule. The
amplification protocol resulted in a 5'-biotinylated double stranded
DNA amplicon and dramatically reduced the cost of high throughput
genotyping by eliminating the need to 5' biotin label every forward
primer used in genotyping.

Thermal cycling was performed in 0.2 mL tubes or a 96 well plate
using an MJ Research Thermal Cycler (calculated temperature) with the
following cycling parameters: 94° C for 5 min; 45 cycles: 94° C for 20
sec, 56° C for 30 sec, 72° C for 60 sec; 72° C for 3 min.
Immobilization of DNA 

The 50 µl PCR reaction was added to 25 µL of streptavidin coated
magnetic beads (Dynal, Oslo, Norway), which were prewashed three
times and resuspended in 1 M NH4Cl, 0.06 M NH4OH. The 5' end of one
strand of each double stranded PCR amplicon was allowed to bind to the
beads for 15 minutes at room temperature. The beads were then
collected with a magnet and the supernatant containing unbound DNA
was removed. The hybridized but unbound strand was released from the
double stranded amplicons by incubation in 100 mM NaOH and washing
of the beads three times with 10 mM Tris pH 8.0.
Detection of AKAP10-5 using BiomassPROBE™ Assay

BiomassPROBE™ assay primer extension analysis (see, U.S.
Patent No. 6,043,031) of the donor population for AKAP10-5 (SEQ ID NO:
33) was performed. Genotyping using these methods was carried out by
resuspending the DNA coated magnetic beads in 26 mM Tris-HCl pH 9.5,
6.5 mM MgCl2, 50 mM dTTP, 50 mM each of ddCTP, ddATP, ddGTP, 2.5 U
of a thermostable DNA polymerase (Amersham), and 20 pmol of a
template specific oligonucleotide PROBE primer
5'-ACTGAGCCTGCTGCATAA-3' (SEQ ID NO: 44) (Operon). Primer
extension occurs with three cycles of oligonucleotide primer
hybridization and extension. The extension products were analyzed after
denaturation from the template with 50 mM NH4Cl and transfer of 150 nL
of each sample to a silicon chip preloaded with 150 nL of H3PA matrix
material. The sample material was allowed to crystallize and was
analyzed by MALDI-TOF (Bruker, PerSeptive). The primer has a mass of
5483.6 daltons. The SNP results in the addition of a ddC to the
primer, giving a mass of 5756.8 daltons for the extended product. The
wild type results in the addition of a dT and ddG to the primer,
giving a mass of 6101 daltons.
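
The extension product masses stated above can be reproduced by adding
the mass of each incorporated nucleotide to the primer mass. The
following is a minimal sketch; the per-nucleotide mass increments are
commonly used average values and are an assumption here, not values
taken from this description, and the names are hypothetical.

```python
# Minimal sketch: expected mass of a primer extension product.
# Average mass added per incorporated nucleotide, in daltons. These
# increments are assumed typical values, not figures from the text.
INCREMENT_DA = {
    "dA": 313.21, "dC": 289.18, "dG": 329.21, "dT": 304.20,
    "ddA": 297.21, "ddC": 273.18, "ddG": 313.21, "ddT": 288.20,
}


def extended_mass(primer_mass: float, additions) -> float:
    """Primer mass plus the mass of each nucleotide added, in order."""
    return primer_mass + sum(INCREMENT_DA[n] for n in additions)


# Primer mass for the AKAP10-5 assay, as stated in the text.
AKAP10_5_PRIMER_DA = 5483.6

# SNP allele: terminated immediately by ddC.
snp_product = extended_mass(AKAP10_5_PRIMER_DA, ["ddC"])

# Wild type allele: one dT is added, then extension terminates with ddG.
wild_type_product = extended_mass(AKAP10_5_PRIMER_DA, ["dT", "ddG"])

if __name__ == "__main__":
    print(f"SNP product:       {snp_product:.1f} Da")
    print(f"Wild type product: {wild_type_product:.1f} Da")
```

Under these assumed increments the two alleles differ by roughly 344
daltons, which is why they appear as separate signals in the mass
spectrum.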

The frequency of the SNP was measured in a population of age
selected healthy individuals. Seven hundred thirteen (713) individuals
under 40 years of age (360 females, 353 males) and 703 individuals over
60 years of age (322 females, 381 males) were tested for the presence of
the SNP, AKAP10-5 (SEQ ID NO: 33). Results are presented below in
Table 1.



TABLE 1
AKAP10-5 (2073V) frequency comparison in 2 age groups

                             <40     >60     delta G allele
Female   Alleles     *G      38.6    34.6    4.0
                     *A      61.4    65.4
         Genotypes   G       13.9    11.8    2.1
                     GA      49.4    45.7
                     A       36.7    42.5
Male     Alleles     *G      41.4    37.0    4.4
                     *A      58.6    63.0
         Genotypes   G       18.4    10.8    7.7
                     GA      45.9    52.5
                     A       35.7    36.7
Total    Alleles     *G      40.0    35.9    4.1
                     *A      60.0    64.1
         Genotypes   G       16.1    11.2    4.9
                     GA      47.7    49.4
                     A       36.2    39.4

Figure 20 graphically shows these results of allele and genotype
distribution in the age and sex stratified Caucasian population.
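
A comparison of allele frequencies between two age groups, such as
that summarized in Table 1, can be sketched as a test on a 2x2 table
of allele counts. The statistical test used herein is not specified;
the Pearson chi-square test below is an assumption, the function names
are hypothetical, and the allele counts are reconstructed from the
rounded percentages in Table 1, so the result is illustrative only.

```python
import math


def allele_counts(n_individuals: int, g_allele_percent: float):
    """Approximate (G, A) allele counts from a rounded percentage."""
    n_alleles = 2 * n_individuals  # two alleles per diploid individual
    g = round(n_alleles * g_allele_percent / 100)
    return g, n_alleles - g


def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]].

    Returns (chi-square statistic, p-value for 1 degree of freedom).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / (
        (a + b) * (c + d) * (a + c) * (b + d)
    )
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Total population rows of Table 1: 713 individuals under 40 (40.0% G)
# and 703 individuals over 60 (35.9% G).
young_g, young_a = allele_counts(713, 40.0)
old_g, old_a = allele_counts(703, 35.9)
chi2, p_value = chi_square_2x2(young_g, young_a, old_g, old_a)

if __name__ == "__main__":
    print(f"chi-square = {chi2:.2f}, p = {p_value:.3f}")
```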

Morbidity marker 3: human methionine sulfoxide reductase A (msrA)

The age-related allele and genotype frequency of this marker in
both genders and the entire population is shown in Figure 21. The
decrease of the homozygous CC genotype in the older male population is
highly significant.

Methionine sulfoxide reductase A (#63306)

PCR Amplification and BiomassPROBE assay detection of the human
methionine sulfoxide reductase A (h-msr-A) in a healthy donor population
PCR Amplification of donor population for h-msr-A
PCR primers were synthesized by OPERON using phosphoramidite
chemistry. Amplification of the h-msr-A target sequence was carried out
in a single 50 µl PCR reaction with 100 ng-1 µg of pooled human genomic
DNA templates. Individual DNA concentrations within the pooled samples
were present in an equal concentration with the final concentration
ranging from 1-25 ng. Each reaction contained 1X PCR buffer (Qiagen,
Valencia, CA), 200 µM dNTPs, 1 U Hotstar Taq polymerase (Qiagen,
Valencia, CA), 4 mM MgCl2, 25 pmol of the forward primer containing
the universal primer sequence and the target specific sequence
5'-TTTCTCTGCACAGAGAGGC-3' (SEQ ID NO: 49), 2 pmol of the reverse
primer
5'-AGCGGATAACAATTTCACACAGGGCTGAAATCCTTCGCTTTACC-3'
(SEQ ID NO: 50), and 10 pmol of a biotinylated universal primer
complementary to the 5' end of the PCR amplicon
5'-AGCGGATAACAATTTCACACAGG-3' (SEQ ID NO: 51). After an initial
round of amplification of the target with the specific forward and
reverse primers, the 5' biotinylated universal primer hybridized and
acted as a reverse primer, thereby introducing a 3' biotin capture
moiety into the molecule. The amplification protocol results in a
5'-biotinylated double stranded DNA amplicon and dramatically reduces
the cost of high throughput genotyping by eliminating the need to 5'
biotin label each forward primer used in genotyping. Thermal cycling
was performed in 0.2 mL tubes or a 96 well plate using an MJ Research
Thermal Cycler (calculated temperature) with the following cycling
parameters: 94° C for 5 min; 45 cycles: 94° C for 20 sec, 56° C for
30 sec, 72° C for 60 sec; 72° C for 3 min.
Immobilization of DNA 

The 50 µl PCR reaction was added to 25 µl of streptavidin coated
magnetic beads (Dynal) prewashed three times and resuspended in 1 M
NH4Cl, 0.06 M NH4OH. The PCR amplicons were allowed to bind to the
beads for 15 minutes at room temperature. The beads were then
collected with a magnet and the supernatant containing unbound DNA
was removed. The unbound strand was released from the double stranded
amplicons by incubation in 100 mM NaOH and washing of the beads three
times with 10 mM Tris pH 8.0.

BiomassPROBE assay analysis of donor population for h-msr-A

Genotyping using the BiomassPROBE assay methods was carried out by
resuspending the DNA coated magnetic beads in 26 mM Tris-HCl pH 9.5,
6.5 mM MgCl2, 50 mM dTTP, 50 mM each of ddCTP, ddATP, ddGTP, 2.5 U of
a thermostable DNA polymerase (Amersham), and 20 pmol of a template
specific oligonucleotide PROBE primer 5'-CTGAAAAGGGAGAGAAAG-3'
(Operon) (SEQ ID NO: 52). Primer extension occurs with three cycles of
oligonucleotide primer hybridization and extension. The extension
products were analyzed after denaturation from the template with
50 mM NH4Cl and transfer of 150 nL of each sample to a silicon chip
preloaded with 150 nL of H3PA matrix material. The sample material was
allowed to crystallize and was analyzed by MALDI-TOF (Bruker,
PerSeptive). The SNP is represented as a T to C transition in the
sequence of two ESTs. The wild type is represented by a T at position
128 of GenBank Accession No. AW195104, which represents the
nucleotide sequence of an EST that is a portion of the wild type human
msrA gene (SEQ ID NO: 39). The SNP is represented as a C at position
129 of GenBank Accession No. AW874187, which represents the
nucleotide sequence of an EST that is a portion of an allele of the
human msrA gene (SEQ ID NO: 40).

In a genomic sequence the SNP is represented as an A to G
transition. The primer utilized in the BiomassPROBE reaction had a
mass of 5654.8 daltons. In the presence of the SNP the primer is
extended by the incorporation of a ddC and has a mass of 5928
daltons. In the presence of the wildtype the primer is extended by
adding a dT and a ddC to produce a mass of 6232.1 daltons.




The frequency of the SNP was measured in a population of age
selected healthy individuals. Five hundred fifty-two (552) individuals
between the ages of 18-39 years (276 females, 276 males) and 552
individuals between the ages of 60-79 (184 females between the ages of
60-69, 368 males between the ages of 60-79) were tested for the
presence of the polymorphism localized in the nontranslated 3' region
of h-msr-A.

The genotype difference between the male age groups among healthy
individuals is significant. For the male population, allele
significance is p = 0.0009 and genotype significance is p = 0.003. The
age-related allele and genotype frequency of this marker in both
genders and the entire population is shown in Figure 21. The decrease
of the homozygous CC genotype in the older male population is highly
significant.

The polymorphism is localized in the non-translated 3' region of the
gene encoding the human methionine sulfoxide reductase (h-msrA). The
exact localization is 451 base pairs downstream of the stop codon
(TAA). It is very likely that this SNP is in linkage disequilibrium
(LD) with another polymorphism further upstream in the coding or
promoter region; thus, it does not directly cause morbidity. The
enzyme methionine sulfoxide reductase has been proposed to exhibit
multiple biological functions. It may serve to repair oxidative
protein damage but also play an important role in the regulation of
proteins by activation or inactivation of their biological functions
(Moskovitz et al. (1998) PNAS 95:14071-14075). It has also been shown
that its activity is significantly reduced in brain tissues of
Alzheimer patients (Gabbita et al., (1999) J. Neurochem
73:1660-1666). It is scientifically conceivable that proteins involved
in the metabolism of reactive oxygen species are associated with
disease.
CONCLUSION

The use of the healthy population provides for the identification of
morbidity markers. The identification of proteins involved in the
G-protein coupled signal transduction pathway or in the
detoxification of oxidative stress can be considered convincing
results. Further confirmation and validation of other potential
polymorphisms already identified in silico in the gene encoding the
human protein kinase A anchoring protein could provide even stronger
association with morbidity and demonstrate that this gene product is a
suitable pharmaceutical or diagnostic target.

EXAMPLE 4
MALDI-TOF Mass Spectrometry Analysis

All of the products of the enzyme assays listed below were
analyzed by MALDI-TOF mass spectrometry. A diluted matrix solution
(0.15 µL) containing 10:1 3-hydroxypicolinic acid:ammonium citrate in
1:1 water:acetonitrile diluted 2.5-fold with water was pipetted onto a
SpectroChip (Sequenom, Inc.) and was allowed to crystallize. Then,
0.15 µL of sample was added. A linear PerSeptive Voyager DE mass
spectrometer or Bruker Biflex MALDI-TOF mass spectrometer, operating in
positive ion mode, was used for the measurements. The sample plates
were kept at 18.2 kV for 400 ns after each UV laser shot
(approximately 250 laser shots total), and then the target voltage was
raised to 20 kV. The original spectra were digitized at 500 MHz.

EXAMPLE 5
Sample Conditioning

Where indicated in the examples below, the products of the
enzymatic digestions were purified with ZipTips (Millipore, Bedford, MA).
The ZipTips were pre-wetted with 10 µL 50% acetonitrile and equilibrated
4 times with 10 µL 0.1 M TEAAc. The oligonucleotide fragments were
bound to the C18 in the ZipTip material by continuous aspiration and
dispensing of each sample into the ZipTip. Each digested
oligonucleotide was conditioned by washing with 10 µL 0.1 M TEAAc,
followed by 4 washing steps with 10 µL H2O. DNA fragments were eluted
from the ZipTip with 7 µL 50% acetonitrile.

Any method for conditioning the samples may be employed. Methods
for conditioning, which generally is used to increase peak resolution,
are well known (see, e.g., International PCT application No. WO
98/20019).

EXAMPLE 6
DNA Glycosylase-Mediated Sequence Analysis

DNA glycosylases modify DNA at each position at which a specific
nucleobase resides in the DNA, thereby producing abasic sites. In a
subsequent reaction with another enzyme, a chemical, or heat, the
phosphate backbone at each abasic site can be cleaved.

The glycosylase utilized in the following procedures was uracil-DNA
glycosylase (UDG). Uracil bases were incorporated into DNA fragments in
each position that a thymine base would normally occupy by amplifying a
DNA target sequence in the presence of uracil. Each uracil substituted
DNA amplicon was incubated with UDG, which cleaved each uracil base
in the amplicon, and was then subjected to conditions that effected
backbone cleavage at each abasic site, which produced DNA fragments.
DNA fragments were subjected to MALDI-TOF mass spectrometry
analysis. Genetic variability in the target DNA was then assessed by
analyzing mass spectra.
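
The fragmentation pattern produced by this procedure can be predicted
in silico. The following is a minimal sketch, assuming complete
replacement of thymine by uracil and complete cleavage at every abasic
site; it returns fragment sequences only and does not model the
terminal groups left by backbone cleavage or the fragment masses. The
function name is hypothetical.

```python
def udg_fragments(strand: str):
    """Predict fragments of a 5'->3' strand after UDG and backbone cleavage.

    Every T is treated as U (dUTP fully replacing dTTP). Each uracil is
    removed and the backbone is cut at that site, so the strand is split
    at every U and the U itself is lost. Empty pieces arising from
    adjacent or terminal uracils are discarded.
    """
    uracil_strand = strand.upper().replace("T", "U")
    return [fragment for fragment in uracil_strand.split("U") if fragment]


if __name__ == "__main__":
    # A C to T variation introduces a new cleavage site, which changes
    # the set of fragments and hence the mass spectrum.
    print(udg_fragments("AACGGA"))  # C allele of a made-up sequence
    print(udg_fragments("AATGGA"))  # T allele of the same sequence
```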

Glycosylases specific for nucleotide analogs or modified
nucleotides, as described herein, can be substituted for UDG in the
following procedures. The glycosylase methods described hereafter, in
conjunction with phosphate backbone cleavage and MALDI, can be used
to analyze DNA fragments for the purposes of SNP scanning, bacteria
typing, methylation analysis, microsatellite analysis, genotyping, and
nucleotide sequencing and re-sequencing.
A. Genotyping

A glycosylase procedure was used to genotype the DNA sequence
encoding UCP-2 (Uncoupling Protein 2). The sequence for UCP-2 is
deposited in GenBank under accession number AF096289. The sequence
variation genotyped in the following procedure was a cytosine
(C-allele) to thymine (T-allele) variation at nucleotide position
4790, which results in an alanine to valine mutation at position 55 in
the UCP-2 polypeptide.

DNA was amplified using a PCR procedure with a 50 µL reaction
volume containing 5 pmol biotinylated primer having the sequence 5'-
TGCTTATCCCTGTAGCTACCCTGTCTTGGCCTTGCAGATCCAA-3' (SEQ
ID NO: 91), 15 pmol non-biotinylated primer having the sequence 5'-
AGCGGATAACAATTTCACACAGGCCATCACACCGCGGTACTG-3' (SEQ
ID NO: 92), 200 µM dATP, 200 µM dCTP, 200 µM dGTP, 600 µM dUTP
(to fully replace dTTP), 1.5 mM to 3 mM MgCl2, 1 U of HotStarTaq
polymerase, and 25 ng of CEPH DNA. Amplification was effected with
45 cycles at an annealing temperature of 56°C.

The amplification product was then immobilized onto a solid
support by incubating 50 µL of the amplification reaction with 5 µL of
prewashed Dynabeads for 20 minutes at room temperature. The
supernatant was removed, and the beads were incubated with 50 µL of
0.1 M NaOH for 5 minutes at room temperature to denature the
double-stranded PCR product in such a fashion that single-stranded DNA
was linked to the beads. The beads were then neutralized by three
washes with 50 µL 10 mM TrisHCl (pH 8). The beads were resuspended in
10 µL of a 60 mM TrisHCl/1 mM EDTA (pH 7.9) solution, and 1 U uracil
DNA glycosylase was added to the solution for 45 minutes at 37°C to
remove uracil nucleotides present in the single-stranded DNA linked to
the beads. The beads were then washed two times with 25 µL of 10 mM
TrisHCl (pH 8) and once with 10 µL of water. The biotinylated strands
were then eluted from the beads with 12 µL of 2 M NH4OH at 60°C for
10 minutes. The backbone of the DNA was cleaved by incubating the
samples for 10 min at 95°C (with a closed lid), and ammonia was
evaporated from the samples by incubating the samples for 11 min at
80°C.

The cleavage fragments were then analyzed by MALDI-TOF mass
spectrometry as described in Example 4. The T-allele generated a unique
fragment of 3254 Daltons. The C-allele generated a unique fragment of
4788 Daltons. These fragments were distinguishable in mass spectra.
Thus, the above-identified procedure was successfully utilized to
genotype individuals heterozygous for the C-allele and T-allele in UCP-2.
B. Glycosylase Analysis Utilizing Pooled DNA Samples

The glycosylase assay was conducted using pooled samples to
detect genetic variability at the UCP-2 locus. DNA of known genotype
was pooled from eleven individuals and was diluted to a fixed
concentration of 5 ng/µL. The procedure provided in Example 6A was
followed using 2 pmol of forward primer having the sequence 5'-
CCCAGTCACGACGTTGTAAAACGTCTTGGCCTTGCAGATCCAAG-3'
(SEQ ID NO: 93) and 15 pmol of reverse primer having the sequence 5'-
AGCGGATAACAATTTCACACAGGCCATCACACCGCGGTACTG-3' (SEQ
ID NO: 94). In addition, 5 pmol of biotinylated primer having the
sequence 5'-bio-CCCAGTCACGACGTTGTAAAACG-3' (SEQ ID NO: 97)
may be introduced to the PCR reaction after about two cycles. The
fragments were analyzed via MALDI-TOF mass spectrometry (Example 4).
As determined in Example 6A, the T-allele, which generated a unique
fragment of 3254 Daltons, could be distinguished in mass spectra from
the C-allele, which generated a unique fragment of 4788 Daltons. Allelic
frequency in the pooled samples was quantified by integrating the area
under each signal corresponding to an allelic fragment. Integration was
accomplished by hand calculations using equations well known to those
skilled in the art. In the pool of eleven samples, this procedure
suggested that 40.9% of the individuals harbored the T allele and
59.09% of the individuals harbored the C allele.
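
The quantification step can be sketched as follows: the area under
each allele-specific signal is integrated, and each allele frequency
is taken as that allele's share of the summed area. Trapezoidal
integration and equal detection efficiency for both fragments are
assumptions made here for illustration, and the function names and
peak areas are hypothetical. The reported values of 40.9% and 59.09%
are consistent with 9 and 13 of the 22 alleles carried by eleven
diploid individuals.

```python
def trapezoid_area(mz, intensity) -> float:
    """Area under a sampled peak by the trapezoidal rule."""
    return sum(
        (mz[i + 1] - mz[i]) * (intensity[i] + intensity[i + 1]) / 2
        for i in range(len(mz) - 1)
    )


def allele_frequencies(area_by_allele: dict) -> dict:
    """Each allele's share of the total integrated peak area."""
    total = sum(area_by_allele.values())
    return {allele: area / total for allele, area in area_by_allele.items()}


if __name__ == "__main__":
    # Hypothetical integrated areas in a 9:13 ratio, for illustration.
    frequencies = allele_frequencies({"T": 9.0, "C": 13.0})
    for allele, frequency in frequencies.items():
        print(f"{allele} allele: {100 * frequency:.2f}%")
```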

C. Glycosylase-Mediated Microsatellite Analysis

A glycosylase procedure was utilized to identify microsatellites of
the Bradykinin Receptor 2 (BKR-2) sequence. The sequence for BKR-2 is
deposited in GenBank under accession number X86173. BKR-2 includes
a SNP in the promoter region, which is a C to T variation, as well as a
SNP in a repeated unit, which is a G to T variation. The procedure
provided in Example 6A was utilized to identify the SNP in the promoter
region, the SNP in the microsatellite repeat region, and the number of
repeated units in the microsatellite region of BKR-2. Specifically, a
forward PCR primer having the sequence 5'-
CTCCAGCTGGGCAGGAGTGC-3' (SEQ ID NO: 95) and a reverse primer
having the sequence 5'-CACTTCAGTCGCTCCCT-3' (SEQ ID NO: 96)
were utilized to amplify BKR-2 DNA in the presence of uracil. The
amplicon was fragmented by UDG followed by backbone cleavage. The
cleavage fragments were analyzed by MALDI-TOF mass spectrometry as
described in Example 4.

With regard to the SNP in the BKR-2 promoter region having a C to
T variation, the C-allele generated a unique fragment having a mass of
7342.4 Daltons and the T-allele generated a unique fragment having a
mass of 7053.2 Daltons. These fragments were distinguishable in mass
spectra. Thus, the above-identified procedure was successfully utilized
to genotype individuals heterozygous for the C-allele and T-allele in
the promoter region of BKR-2.
With regard to the SNP in the BKR-2 repeat region having a G to T
variation, the T-allele generated a unique fragment having a mass of 1784
Daltons, which was readily detected in a mass spectrum. Hence, the
presence of the T-allele was indicative of the G to T sequence variation in
the repeat region of BKR-2.

In addition, the number of repeat regions was distinguished
between individuals having two repeat sequences and individuals having
three repeat sequences in BKR-2. The DNA of these individuals did not
harbor the G to T sequence variation in the repeat sequence as each
repeat sequence contained a G at the SNP locus. The number of repeat
regions was determined in individual samples by calculating the area
under a signal corresponding to a unique DNA fragment having a mass of
2771.6 Daltons. This signal in spectra generated from individuals having
two repeat regions had an area that was thirty-three percent less than the
area under the same signal in spectra generated from individuals having
three repeat regions. Thus, the procedures discussed above can be
utilized to genotype individuals for the number of repeat sequences
present in BKR-2.
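
The area comparison described above can be sketched in a few lines of
Python. This is only an illustration, assuming that the area under the
2771.6 Dalton signal scales linearly with the number of repeats and that
both areas were normalized in the same way; the function name and the
example areas are hypothetical and are not taken from the procedure itself.

```python
def estimate_repeat_count(sample_area, reference_area, reference_repeats=3):
    """Estimate the repeat number from a normalized peak area.

    Assumes (for illustration) that the peak area is proportional to the
    number of repeat units and that reference_area comes from a sample
    known to carry reference_repeats repeats.
    """
    if reference_area <= 0:
        raise ValueError("reference_area must be positive")
    return round(reference_repeats * sample_area / reference_area)


# An area about one third smaller than the three-repeat reference
# is read as two repeats under this assumption.
print(estimate_repeat_count(sample_area=0.67, reference_area=1.0))
```

In practice the areas would have to be normalized against an internal
signal before such a ratio is meaningful.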

D. Bisulfite Treatment Coupled with Glycosylase Digestion

Bisulfite treatment of genomic DNA can be utilized to analyze
positions of methylated cytosine residues within the DNA. Treating
nucleic acids with bisulfite deaminates cytosine residues to uracil
residues, while methylated cytosine remains unmodified. Thus, by
comparing the sequence of a PCR product generated from genomic DNA
that is not treated with bisulfite with the sequence of a PCR product
generated from genomic DNA that is treated with bisulfite, the degree of
methylation in a nucleic acid as well as the positions where cytosine is
methylated can be deduced.
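
The comparison of treated and untreated sequences can be illustrated
with a short Python sketch. It assumes that the two reads are already
aligned and of equal length, and that a converted cytosine appears as T
(or U) in the treated read; the function name and the toy sequences are
hypothetical.

```python
def methylated_cytosines(untreated, treated):
    """Compare aligned reads from untreated and bisulfite-treated DNA.

    Returns two lists of 0-based positions: cytosines that stayed C after
    treatment (read as methylated) and cytosines that were converted.
    """
    if len(untreated) != len(treated):
        raise ValueError("sequences must be aligned and of equal length")
    methylated, converted = [], []
    for i, (u, t) in enumerate(zip(untreated.upper(), treated.upper())):
        if u != "C":
            continue  # only cytosine positions are informative
        if t == "C":
            methylated.append(i)
        elif t in ("T", "U"):
            converted.append(i)
    return methylated, converted


methylated, converted = methylated_cytosines("ACGTCCGA", "ACGTTTGA")
degree = len(methylated) / (len(methylated) + len(converted))
print(methylated, converted, degree)
```

The ratio of retained to total cytosines gives a simple estimate of the
degree of methylation in the compared region.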

Genomic DNA (2 µg) was digested by incubation with 1 µL of a
restriction enzyme at 37°C for 2 hours. An aliquot of 3 M NaOH was
added to yield a final concentration of 0.3 M NaOH in the digestion
solution. The reaction was incubated at 37°C for 15 minutes followed by
treatment with 5.35 M urea, 4.44 M bisulfite, and 10 mM hydroquinone,
where the final concentration of hydroquinone was 0.5 mM.

The sample that was treated with bisulfite (sample A) was
compared to the same digestion sample that had not undergone bisulfite
treatment (sample B). After sample A was treated with bisulfite as
described above, sample A and sample B were amplified by a standard
PCR procedure. The PCR procedure included the step of overlaying each
sample with mineral oil and then subjecting the sample to thermocycling
(20 cycles of 15 minutes at 55°C followed by 30 seconds at 95°C). The
PCR reaction contained four nucleotide bases, C, A, G, and U. The
mineral oil was removed from each sample, and the PCR products were
purified with glassmilk. Sodium iodide (3 volumes) and glassmilk (5 µL)
were added to samples A and B. The samples were then placed on ice
for 8 minutes, washed with 420 µL cold buffer, centrifuged for 10
seconds, and the supernatant fractions were removed. This process was
repeated twice and then 25 µL of water was added. Samples were
incubated for 5 minutes at 37°C and centrifuged for 20 seconds, the
supernatant fraction was collected, and this
incubation/centrifugation/supernatant fraction collection procedure was
then repeated. NaOH (50 µL of 0.1 M) was then added to the samples to
denature the DNA. The samples were incubated at room temperature for 5
minutes, washed three times with 50 µL of 10 mM Tris-HCl (pH 8), and
resuspended in 10 µL of 60 mM Tris-HCl/1 mM EDTA, pH 7.9.

The PCR products from sample A and sample B were
then treated with 2 U of UDG (MBI Fermentas) and then subjected to
backbone cleavage, as described herein. The resulting fragments from
each of sample A and sample B were analyzed by MALDI-TOF mass
spectrometry as described in Example 4. Sample A gave rise to a greater
number of fragments than sample B, indicating that the nucleic acid
harbored at least one methylated cytosine moiety.

EXAMPLE 7 
Fen-Ligase-Mediated Haplotyping 

Haplotyping procedures permit the selection of a fragment from one of an
individual's two homologous chromosomes and the genotyping of linked SNPs
on that fragment. The direct resolution of haplotypes can yield increased
information content, improving the diagnosis of any linked disease genes
or identifying linkages associated with those diseases. In previous
studies, haplotypes were typically reconstructed indirectly through
pedigree analysis (in cases where pedigrees were available), through
laborious and unreliable allele-specific PCR, or through single-molecule
dilution methods well known in the art.

A haplotyping procedure was used to determine the presence of
two SNPs, referred to as SNP1 and SNP2, located on one strand in a DNA
sample. The haplotyping procedure used in this assay utilized Fen-1, a
site-specific "flap" endonuclease that cleaves DNA "flaps" created by the
overlap of two oligonucleotides hybridized to a target DNA strand. The
two overlapping oligonucleotides in this example were short arm and long
arm allele-specific adaptors. The target DNA was an amplified nucleic
acid that had been denatured and contained SNP1 and SNP2.

The short arm adaptor included a unique sequence not found in the
target DNA. The 3' distal nucleotide of the short arm adaptor was
identical to one of the SNP1 alleles. Moreover, the long arm adaptor
included two regions: a 3' region complementary to the short arm and a
5' gene-specific region complementary to the fragment of interest adjacent
to the SNP. If there was a match between the adaptor and one of the
homologues, the Fen enzyme recognized and cleaved the overlapping
flap. The short arm of the adaptor was then ligated to the remainder of
the target fragment (minus the SNP site). This ligated fragment was used
as the forward primer for a second PCR reaction in which only the ligated
homologue was amplified. The second PCR product (PCR2) was then
analyzed by mass spectrometry. If there was no match between the
adaptors and the target DNA, there was no overlap, no cleavage by
Fen-1, and thus no PCR2 product of interest.

If there was more than one SNP in the sequence of interest, the
second SNP (SNP2) was found by using an adaptor that was specific for
SNP2 and hybridizing the adaptor to the PCR2 product containing the first
SNP. The Fen-ligase and amplification procedures were repeated for the
PCR2 product containing the first SNP. If the amplified product yielded a
second SNP, then SNP1 and SNP2 were on the same fragment.

If the SNP is unknown, then four allele-specific adaptors (e.g., C, G,
A, and T) can be used to hybridize with the target DNA. The substrates
are then treated with the Fen-ligase protocol, including amplification. The
PCR2 products may be analyzed by PROBE, as described herein, to
determine which adaptors were hybridized to the DNA target and thus
identify the SNPs in the sequence.

A Fen-ligase assay was used to detect two SNPs present in Factor
VII. These SNPs are located 814 base pairs apart from each other. SNP1
was located at position 8401 (C to T), and SNP2 was located at 9215 (G
to A) (SEQ ID ft).

A. First Amplification Step

A PCR product (PCR1) was generated for an individual known to be
heterozygous at SNP1, a short distance from the 5' end of the SNP.
Specifically, a 10 µL PCR reaction was performed by mixing 1.5 mM
MgCl2, 200 µM of each dNTP, 0.5 U HotStar polymerase, 0.1 µM of a
forward primer having the sequence 5'-GCG CTC CTG TCG GTG CCA
(SEQ ID NO: 56), 0.1 µM of a reverse primer having the sequence 5'-GCC
TGA CTG GTG GGG CCC (SEQ ID NO: 57), and 1 ng of genomic DNA.
The annealing temperature was 58°C, and the amplification process
yielded fragments that were 861 bp in length.

The PCR1 reaction mixture was divided in half and was treated
with an exonuclease I/SAP mixture (0.22 µL mixture/5 µL PCR1 reaction)
which contained 1.0 µL SAP and 0.1 µL exoI. The exonuclease
treatment was done for 30 minutes at 37°C and then 20 minutes at
85°C to denature the DNA.

B. Adaptor Oligonucleotides 

A solution of allele-specific adaptors (C and T), each consisting of one
long and one short oligonucleotide, was prepared. The long
arm and short arm oligonucleotides of each adaptor (10 µM) were mixed in
a 1:1 ratio and heated for 30 seconds at 95°C. The temperature was
reduced in 2°C increments to 37°C for annealing. The C-adaptor had a
short arm sequence of 5'-CAT GCA TGC ACG GTC (SEQ ID NO: 58) and
a long arm sequence of 5'-CAG AGA GTA CCC CTC GAC CGT GCA TGC
ATG (SEQ ID NO: 59). Hence, the long arm of the adaptor was 30 bp
(15 bp gene-specific), and the short arm was 15 bp. The T-adaptor had a
short arm sequence of 5'-CAT GCA TGC ACG GTT (SEQ ID NO: 60) and
a long arm sequence of 5'-GTA CGT ACG TGC CAA CTC CCC ATG AGA
GAC (SEQ ID NO: 61). The adaptor could also have a hairpin structure in
which the short and long arm are separated by a loop consisting of 3 to
10 nucleotides (SEQ ID NO: 118).
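
The statement that the 3' region of the long arm is complementary to the
short arm can be checked with a small Python sketch. The helper names are
hypothetical; the sequences are those printed above. In this check the
C-adaptor arms pair as written, while the T-adaptor long arm pairs with
its short arm only when the printed string is read in the opposite
orientation, which may reflect how that sequence was written out.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(seq):
    """Remove spacing and normalize case."""
    return seq.replace(" ", "").upper()


def reverse_complement(seq):
    return clean(seq).translate(COMPLEMENT)[::-1]


def long_arm_pairs_with_short_arm(short_arm, long_arm):
    """True if the 3' end of the long arm is the reverse complement of the short arm."""
    return clean(long_arm).endswith(reverse_complement(short_arm))


short_c = "CAT GCA TGC ACG GTC"
long_c = "CAG AGA GTA CCC CTC GAC CGT GCA TGC ATG"
short_t = "CAT GCA TGC ACG GTT"
long_t = "GTA CGT ACG TGC CAA CTC CCC ATG AGA GAC"

print(long_arm_pairs_with_short_arm(short_c, long_c))
print(long_arm_pairs_with_short_arm(short_t, long_t))
print(long_arm_pairs_with_short_arm(short_t, clean(long_t)[::-1]))
```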

C. FEN-ligase reaction 

In two tubes (one tube for each allele-specific adaptor per sample)
was placed a solution (Solution A) consisting of 3.5 µL 10 mM
16% PEG/50 mM MOPS, 1.2 µL 25 mM MgCl2, 1.5 µL 10X Ampligase
Buffer, and 2.5 µL PCR1. Each tube containing Solution A was incubated
at 95°C for 5 minutes to denature the PCR1 product. A second solution
(Solution B) consisting of 1.65 µL Ampligase (Thermostable ligase,
Epicentre Technologies), 1.65 µL 200 ng/µL MFEN (from Methanococcus
jannaschii), and 3.0 µL of an allele-specific adaptor (C or T) was prepared.
Thus, different variations of Solution B, each variation containing a
different allele-specific adaptor, were made. Solution B was added to
Solution A at 95°C and incubated at 55°C for 3 hours. The total
reaction volume was 15.0 µL per adaptor-specific reaction. For a bi-allelic
system, 2 x 15.0 µL reactions were required.
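
As a quick arithmetic check, the component volumes listed for Solution A
and Solution B can be summed to confirm the stated 15.0 µL per reaction.
The sketch below uses the volumes given in the text; the dictionary keys
and the function name are only illustrative labels.

```python
# Volumes in microliters, as listed for one adaptor-specific reaction.
solution_a = {
    "PEG/MOPS": 3.5,
    "25 mM MgCl2": 1.2,
    "10X Ampligase Buffer": 1.5,
    "PCR1": 2.5,
}
solution_b = {
    "Ampligase": 1.65,
    "MFEN": 1.65,
    "allele-specific adaptor": 3.0,
}


def total_volume(*solutions):
    """Sum all component volumes, rounded to 0.01 microliter."""
    return round(sum(sum(s.values()) for s in solutions), 2)


per_reaction = total_volume(solution_a, solution_b)
print(per_reaction)      # volume of one adaptor-specific reaction
print(2 * per_reaction)  # bi-allelic system: one reaction per adaptor
```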

The Fen-ligase reaction in each tube was then deactivated by
adding 8.0 µL 10 mM EDTA. Then, 1.0 µL exoIII/Buffer (70%/30%)
solution was added to each sample and incubated 30 minutes at 37°C,
20 minutes at 70°C (to deactivate exoIII), and 5 minutes at 95°C (to
denature the sample and dissociate unused adaptor from template). The
samples were cooled in an ice slurry and purified on UltraClean PCR
Clean-up (MoBio) spin columns, which removed all fragments less than
100 base pairs in length. The fragments were eluted with 50 µL H2O.

D. Second Amplification Step

A second amplification reaction (PCR2) was conducted in each
sample tube using the short arm adaptor (C or T) sequence as the forward
primer (minus the SNP1 site). Only the ligated homologue was amplified.
A standard PCR reaction was conducted with a total volume of 10.0 µL
consisting of 1X Buffer (final concentration), 1.5 mM final concentration
MgCl2, 200 µM final concentration dNTPs, 0.5 U HotStar polymerase, 0.1
µM final concentration forward primer 5'-CAT GCA TGC ACG GT (SEQ ID






WO 01/27857 PCT/US00/28413




NO: 62), 0.1 µM final concentration reverse primer 5'-GCC TGA CTG GTG
GGG CCC (SEQ ID NO: 63), and 1.0 µL of the purified FEN-ligase reaction
solution. The annealing temperature was 58°C. The PCR2 product was
analyzed by MALDI-TOF mass spectrometry as described in Example 4.
The mass spectrum of Fen SNP1 showed a mass of 6084.08 Daltons,
representing the C allele.

E. Genotyping Additional SNPs

The second SNP (SNP2) can be found by using an adaptor that is
specific for SNP2 and hybridizing that adaptor to the PCR2 product
containing the first SNP. The Fen-ligase and amplification procedures are
repeated for the PCR2 product containing the first SNP. If the amplified
product yields a second SNP, then SNP1 and SNP2 are on the same
fragment. The mass spectrum of SNP2, representing the T allele,
showed a mass of 6359.88 Daltons.

This assay can also be performed upon pooled DNA to yield
haplotype frequencies as described herein. The Fen-ligase assay can be
used to analyze multiplexes as described herein.

EXAMPLE 8
Nickase-Mediated Sequence Analysis

A DNA nickase, or DNase, was used to recognize and cleave one strand
of a DNA duplex. The two nickases used were NY2A nickase and NYS1
nickase (Megabase), which cleave DNA at the following sites:

NY2A: 5'...R↓AG...3'
      3'...Y TC...5'    where R = A or G and Y = C or T
NYS1: 5'...↓CC[A/G/T]...3'
      3'... GG[T/C/A]...5'

A. Nickase Digestion

Tris-HCl (10 mM), KCl (10 mM, pH 8.3), magnesium acetate (25
mM), BSA (1 mg/mL), and 6 U of Cvi NY2A or Cvi NYS1 Nickase
(Megabase Research) were added to 25 pmol of double-stranded
oligonucleotide template having a sequence of 5'-CGC AGG GTT TCC
TCG TCG CAC TGG GCA TGT G-3' (SEQ ID NO: 90, Operon, Alameda,
CA) synthesized using standard phosphoramidite chemistry. With a total
volume of 20 µL, the reaction mixture was incubated at 37°C for 5 hours,
and the digestion products were purified using ZipTips (Millipore, Bedford,
MA) as described in Example 5. The samples were analyzed by
MALDI-TOF mass spectrometry as described in Example 1. The nickase Cvi
NY2A yielded three fragments with masses 4049.76 Daltons, 5473.14
Daltons, and 9540.71 Daltons. The Cvi NYS1 nickase yielded fragments
with masses 2063.18 Daltons, 3056.48 Daltons, 6492.81 Daltons, and
7450.14 Daltons.
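
The expected nick positions on this template can be explored with a short
Python sketch. It assumes that NY2A nicks between R and AG and that NYS1
nicks immediately 5' of CC[A/G/T]; the names NICKASES, nick_positions, and
fragment_lengths are hypothetical, and complete digestion is assumed.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Assumed recognition sites: (lookahead pattern, offset of nick from match start).
NICKASES = {
    "NY2A": (r"(?=[AG]AG)", 1),   # R^AG
    "NYS1": (r"(?=CC[AGT])", 0),  # ^CC(A/G/T)
}


def reverse_complement(seq):
    return seq.upper().translate(COMPLEMENT)[::-1]


def nick_positions(strand, enzyme):
    """0-based index of the first base 3' of each nick on one strand."""
    pattern, offset = NICKASES[enzyme]
    return [m.start() + offset for m in re.finditer(pattern, strand)]


def fragment_lengths(strand, enzyme):
    """Fragment lengths (in nucleotides) after complete nicking of one strand."""
    cuts = [0] + nick_positions(strand, enzyme) + [len(strand)]
    return [b - a for a, b in zip(cuts, cuts[1:])]


top = "CGCAGGGTTTCCTCGTCGCACTGGGCATGTG"  # SEQ ID NO: 90 as printed
bottom = reverse_complement(top)

for enzyme in NICKASES:
    print(enzyme, fragment_lengths(top, enzyme), fragment_lengths(bottom, enzyme))
```

Under these assumptions NY2A leaves one strand intact and splits the other
into two pieces, which is in line with the three masses reported above.
The predicted lengths can be converted to approximate masses for
comparison with a spectrum.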

B. Nickase Digestion of Pooled Samples 

DQA (HLA Class II-DQ Alpha, expected fragment size = 225 bp) was
amplified from the genomic DNA of 100 healthy individuals. DQA was
amplified using standard PCR chemistry in a reaction having a total
volume of 50 µL consisting of 10 mM Tris-HCl, 10 mM KCl (pH 8.3), 2.5
mM MgCl2, 200 µM of each dNTP, 10 pmol of a forward primer having
the sequence 5'-GTG CTG CAG GTG TAA ACT TGT ACC AG-3' (SEQ ID
NO: 64), 10 pmol of a reverse primer having the sequence 5'-CAC GGA
TCC GGT AGC AGC GGT AGA GTT G-3' (SEQ ID NO: 65), 1 U DNA
polymerase (Stoffel fragment, Perkin Elmer), and 200 ng human genomic
DNA (2 ng DNA/individual). The template was denatured at 94°C for 5
minutes. Thermal cycling was continued with a touch-down program that
included 45 cycles of 20 seconds at 94°C, 30 seconds at 56°C, 1
minute at 72°C, and a final extension of 3 minutes at 72°C. The crude
PCR product was used in the subsequent nickase reaction.

The unpurified PCR product was subjected to nickase digestion.
Tris-HCl (10 mM), KCl (10 mM, pH 8.3), magnesium acetate (25 mM),
BSA (1 mg/mL), and 5 U of Cvi NY2A or Cvi NYS1 Nickase (Megabase
Research) were added to 25 pmol of the amplified template with a total
reaction volume of 20 µL. The mixture was then incubated at 37°C for 5
hours. The digestion products were purified with ZipTips (Millipore,
Bedford, MA) as described in Example 5. The samples were analyzed by
MALDI-TOF mass spectrometry as described in Example 4. This assay
can also be used to do multiplexing and standardless genotyping as
described herein.

To simplify the nickase mass spectrum, the two complementary
strands can be separated after digestion by using a single-stranded
undigested PCR product as a capture probe. This probe (preparation
shown below in Example 8C) can be hybridized to the nickase fragments
in hybridization buffer containing 200 mM sodium citrate and 1% blocking
reagent (Boehringer Mannheim). The reaction is heated to 95°C for 5
minutes and cooled to room temperature over 30 minutes by using a
thermal cycler (PTC-200 DNA engine, MJ Research, Waltham, MA). The
capture probe-nickase fragment is immobilized on 140 µg of
streptavidin-coated magnetic beads. The beads are subsequently washed
three times with 70 mM ammonium citrate. The captured single-stranded
nickase fragments are eluted by heating to 80°C for 5 minutes in 5 µL of
50 mM ammonium hydroxide.

C. Preparation of Capture Probe 

The capture probe is prepared by amplifying the human β-globin
gene (3' end of intron 1 to 5' end of exon 2) via PCR methods in a total
volume of 50 µL consisting of GeneAmp 1X PCR Buffer II, 10 mM Tris-HCl,
pH 8.3, 50 mM KCl, 2 mM MgCl2, 0.2 mM dNTP mix, 10 pmol of
each primer (forward primer 5'-ACTGGGCATGTGGAGACAG-3' (SEQ ID
NO: 66) and biotinylated reverse primer bio5'-GCACTTTCTTGCCATGAG-3'
(SEQ ID NO: 67)), 2 U of AmpliTaq Gold, and 200 ng of human genomic
DNA. The template is denatured at 94°C for 8 minutes. Thermal cycling
is continued with a touch-down program that includes 11 cycles of 20
seconds at 94°C, 30 seconds at 64°C, 1 minute at 72°C; and a final
extension of 5 minutes at 72°C. The amplicon is purified using the
UltraClean PCR clean-up kit (MO Bio Laboratories, Solana Beach, CA).

EXAMPLE 9 

Multiplex Type IIS SNP Assay 

A Type IIS assay was used to identify human gene sequences with
known SNPs. The Type IIS enzyme used in this assay was Fok I, which
effected double-stranded cleavage of the target DNA. The assay involved
the steps of amplification and Fok I treatment of the amplicon. In the
amplification step, the primers were designed so that each PCR product
of a designated gene target was less than 100 bases and so that a Fok I
recognition sequence was incorporated at the 5' and 3' end of the
amplicon. Therefore, the fragments that were cleaved by Fok I included a
center fragment containing the SNP of interest.

Ten human gene targets with known SNPs were analyzed by this
assay. Sequences of the ten gene targets, as well as the primers used to
amplify the target regions, are found in Table 5. The ten targets were
lipoprotein lipase, prothrombin, factor V, cholesterol ester transfer protein
(CETP), factor VII, factor XIII, HLA-H exon 2, HLA-H exon 4,
methylenetetrahydrofolate reductase (MTHR), and P53 exon 4 codon 72.

Amplification of the ten human gene sequences was carried out in
a single 50 µL volume PCR reaction with 20 ng of human genomic DNA
template in 5 PCR reaction tubes. Each reaction vial contained 1X PCR
buffer (Qiagen), 200 µM dNTPs, 1 U Hotstar Taq polymerase (Qiagen), 4
mM MgCl2, and 10 pmol of each primer. US8, having the sequence
5'-TCAGTCACGACGTT-3' (SEQ ID NO: 68), and US9, having the sequence
5'-CGGATAACAATTTC-3' (SEQ ID NO: 69), were used for the forward and
reverse primers, respectively. Moreover, the primers were designed such
that a Fok I recognition site was incorporated at the 5' and 3' ends of the
amplicon. Thermal cycling was performed in 0.2 mL tubes or a 96-well
plate using an MJ Research Thermal Cycler (calculated temperature) with
the following cycling parameters: 94°C for 5 minutes; 45 cycles: 94°C
for 20 seconds, 56°C for 20 seconds, 72°C for 60 seconds; and 72°C
for 3 minutes.

Following PCR, the sample was treated with 0.2 U Exonuclease I
(Amersham Pharmacia) and Shrimp Alkaline Phosphatase (SAP; Amersham
Pharmacia) to remove the unincorporated primers and dNTPs. Typically,
0.2 U of exonuclease I and SAP were added to 5 µL of the PCR sample.
The sample was then incubated at 37°C for 15 minutes. Exonuclease I
and SAP were then inactivated by heating the sample to 85°C for 15
minutes. Fok I digestion was performed by adding 2 U of Fok I (New
England Biolabs) to the 5 µL PCR sample and incubating at 37°C for 30
minutes. Since the Fok I restriction sites are located on both sides of the
amplicon, the 5' and 3' cutoff fragments have higher masses than the
center fragment containing the SNP. The sample was then purified by
anion exchange and analyzed by MALDI-TOF mass spectrometry as
described in Example 4. The masses of the gene fragments from this
multiplexing experiment are listed in Table 6. These gene fragments were
resolved in mass spectra, thereby allowing multiplex analysis of sequence
variability in these genes.

Table 5

Genes for Multiplex Type IIS Assay

Gene: Lipoprotein Lipase (Asn291Ser)
Sequence (Seq. ID Nos. 98-99):
    cctttgagaa agggctctgc ttgagttgta gaaagaaccg ctgcaacaat
    ctgggctatg agatca[a»g]taa agtcagagcc aaaagaagca acaaaatata
Primers:
    5' caatttcatcgctggatgcaatci gggctatgagatc 3' (Seq. ID No. 70)
    5' tttggctctgact 3' (Seq. ID No. 71)

Gene: Prothrombin
Sequence (Seq. ID Nos. 100-101):
    26731 gaattatttttgtgtttctaaaactatggt tcccaataaa agtgactctc
    26781 aac[g»a]agcctc aatgctccca ntgetattea tgggcaactc tctgggctca
Primers:
    5' tcagtcacgacgtiggatgccaa taaaagtgactctcagc 3' (Seq. ID No. 72)
    5' cggataacaatttcggatgcact gggagcattgaggc 3' (Seq. ID No. 73)

Gene: Factor V (Arg506Gln)
Sequence (Seq. ID Nos. 102-103):
    taataggact acttctaatc tgtaagagca gatccctgga caggc[g»a]agoa
    atacaggtat tftatccttaaaataacctt tcao
Primers:
    5' tcagtcacgacgttggatgagca gatccctoaacaooc 3' (Seq. ID No. 74)
    5' cggataacaatttcggatggaca doaI«lCCIC|Y ailCC O (Seq. ID No. 75)

Gene: Cholesterol ester transfer protein (CETP) (I405V)
Sequence (Seq. ID Nos. 104-105):
    1261 ctcaccatgg gcatttgatt gcagagcage tccgagtcc[g»a] tccagagctt
    1311 cctgcagtca atgatcaccg ctgtgggcat ccctgaggtc atgtctcgta
Primers:
    5' tcagtcacgacgttggatgcaga gcagctccaaotc 3' (Seq. ID No. 76)
    5' cagcggtgatcattggatgcagg (Seq. ID No. 77)

Gene: Factor VII (R353Q)
Sequence (Seq. ID Nos. 106-107):
    1221 agcaaggact ectgeaaggg ggacagtgga ggcccacatg ccacccacia
    1271 cc[a»g]gggcacg tggtacctoa cgggcatcat caactaaoac caaoactaca
Primers:
    5' tcagtcacgacgttggatgccca catgccacccactac 3' (Seq. ID No. 78)
    5' cggataacaatttcggatgcccg (Seq. ID No. 79)

Gene: Factor XIII (V34L)
Sequence (Seq. ID Nos. 108-109):
    111 caataactct aatgeagegg aagatgacct gcccacagta gagcttcagg
    161 g[g»t]ggtgcc ccggggcgtc aacctgeaag gtatgagcat accccccttc
Primers:
    5' tcagtcacgacgttggatgccca cagtggagcttcag 3' (Seq. ID No. 80)
    5' gctcataccttgcaggatoacg 3' (Seq. ID No. 81)

Gene: HLA-H exon 2 (His63Asp)
Sequence (Seq. ID Nos. 110-111):
    361 tigaagcttt gggctacgtg gatgaccagc tgttcgtgtt ctatoat[c»a]at
    411 gagagtcgcc gtgtggaocc ccgaactcca tgggtucca gtagaatttc
Primers:
    5' tcagtcacgacmtggatgacca gctgttcgtgttc 3' (Seq. ID No. 82)
    5' tacatggagttcagggatgcaca cggcgactctc 3' (Seq. ID No. 83)

Gene: HLA-H exon 4 (Cys282Tyr)
Sequence (Seq. ID Nos. 112-113):
    1021 ggataaccrt ggctgtaccc cctggggaag aacagaaata tacgt[g»a]ccao
    1071 gtggaacacc caggcctooa tcagcccctc attgtgatct gggagccctc
Primers:
    5' tcagtcacgacgttggatgggga agagcagagatatacgt 3' (Seq. ID No. 84)
    5' gaggggctgatccaggatgggt gctccac 3' (Seq. ID No. 85)

Gene: Methylenetetrahydrofolate reductase (MTHR) (Ala222Val)
Sequence (Seq. ID Nos. 114-115):
    761 tgaagcactt gaagga gaag gtgtctgcgg aaalc Mlcgattt catcat cacg
    811 cagctmctttgagoctoa cacattcttc
Primers:
    5' tcagtcacgacgttggatgggga agagcegagatatacgt 3' (Seq. ID No. 86)
    5' gaggggctgotccaggatgggt gctccac 3' (Seq. ID No. 87)

Gene: P53 Exon 4 Codon 72 (Arg72Pro)
Sequence (Seq. ID Nos. 116-117):
    12101 tccagatgaa getcecagaa tgccagaggc tgctcccc[a»c]c gtggcccctg
    12151 caccagcaac tcctacaccg gcggccccta
Primers:
    5' gatgaagctcccaggatgccag aggc 3' (Seq. ID No. 88)
    5' gccgccggtgtaggatgctgctg gtgc 3' (Seq. ID No. 89)

Table 6

Masses of Center Fragments for Ten Different SNPs Typed by the Type IIS
Assay

EXAMPLE 10

Exemplary use of parental medical history parameter for stratification of healthy
database

A healthy database can be used to associate a disease state with a
specific allele (SNP) that has been found to show a strong association between
age and the allele, in particular the homozygous genotype. The method involves
using the same healthy database used to identify the age-dependent association;
however, stratification is by information given by the donors about common
disorders from which their parents suffered (the donor's familial history of
disease). There are three possible answers a donor could give about the health
status of their parents: neither was affected, one was affected, or both were
affected. Only donors above a certain minimum age, depending on the disease,
are utilized, as the donors' parents must be old enough to have exhibited
clinical disease phenotypes. The genotype frequency in each of these groups is
determined and compared with each other. If there is an association of the
marker in the donor to a disease, the frequency of the heterozygous genotype will
be increased. The frequency of the homozygous genotype should not increase,
as it should be significantly underrepresented in the healthy population.
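
The stratification described above can be sketched as follows. This is a
minimal illustration, assuming each donor record carries an age, the number
of affected parents (0, 1, or 2), and a genotype; the field names, the age
cutoff, and the example records are hypothetical.

```python
from collections import Counter, defaultdict


def genotype_frequencies(donors, min_age):
    """Genotype frequencies per parental-status group for donors of min_age or older."""
    counts = defaultdict(Counter)
    for donor in donors:
        if donor["age"] >= min_age:
            counts[donor["parents_affected"]][donor["genotype"]] += 1
    return {
        group: {g: n / sum(c.values()) for g, n in c.items()}
        for group, c in counts.items()
    }


donors = [
    {"age": 50, "parents_affected": 0, "genotype": "CC"},
    {"age": 55, "parents_affected": 0, "genotype": "CT"},
    {"age": 60, "parents_affected": 1, "genotype": "CT"},
    {"age": 30, "parents_affected": 1, "genotype": "TT"},  # below the cutoff
]
print(genotype_frequencies(donors, min_age=40))
```

Comparing the heterozygous frequency across the three groups then
indicates whether the marker tracks with the parental disease history.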

EXAMPLE 11

Method and Device for Identifying a Biological Sample
Description

In accordance with the present invention, a method and device for
identifying a biological sample is provided. Referring now to FIG. 24, an
apparatus 10 for identifying a biological sample is disclosed. The apparatus 10
for identifying a biological sample generally comprises a mass spectrometer 15
communicating with a computing device 20. In a preferred embodiment, the
mass spectrometer may be a MALDI-TOF mass spectrometer manufactured by
Bruker-Franzen Analytik GmbH; however, it will be appreciated that other mass
spectrometers can be substituted. The computing device 20 is preferably a
general purpose computing device. However, it will be appreciated that the
computing device could be alternatively configured; for example, it may be
integrated with the mass spectrometer or could be part of a computer in a larger
network system.

The apparatus 10 for identifying a biological sample may operate as an
automated identification system having a robot 25 with a robotic arm 27
configured to deliver a sample plate 29 into a receiving area 31 of the mass
spectrometer 15. In such a manner, the sample to be identified may be placed
on the plate 29 and automatically received into the mass spectrometer 15. The
biological sample is then processed in the mass spectrometer to generate data
indicative of the mass of DNA fragments in the biological sample. This data may
be sent directly to computing device 20, or may have some preprocessing or
filtering performed within the mass spectrometer. In a preferred embodiment,
the mass spectrometer 15 transmits unprocessed and unfiltered mass
spectrometry data to the computing device 20. However, it will be appreciated
that the analysis in the computing device may be adjusted to accommodate
preprocessing or filtering performed within the mass spectrometer.

Referring now to FIG. 25, a general method 35 for identifying a biological
sample is shown. In method 35, data is received into a computing device from a
test instrument in block 40. Preferably the data is received in a raw,
unprocessed and unfiltered form, but alternatively may have some form of
filtering or processing applied. The test instrument of a preferred embodiment is
a mass spectrometer as described above. However, it will be appreciated that
other test instruments could be substituted for the mass spectrometer.

The data generated by the test instrument, and in particular the mass
spectrometer, includes information indicative of the identification of the
biological sample. More specifically, the data is indicative of the DNA
composition of the biological sample. Typically, mass spectrometry data
gathered from DNA samples obtained from DNA amplification techniques are
noisier than, for example, those from typical protein samples. This is due in part
to the fact that protein samples are more readily prepared in abundance and
are more easily ionized than DNA samples.
Accordingly, conventional mass spectrometer data analysis techniques are
generally ineffective for DNA analysis of a biological sample. To improve the
analysis capability so that DNA composition data can be more readily discerned,
a preferred embodiment uses wavelet technology for analyzing the DNA mass
spectrometry data. Wavelets are an analytical tool for signal processing,
numerical analysis, and mathematical modeling. Wavelet technology provides a
basic expansion function which is applied to a data set. Using wavelet
decomposition, the data set can be simultaneously analyzed in the time and
frequency domains. Wavelet transformation is the technique of choice in the
analysis of data that exhibit complicated time (mass) and frequency domain
information, such as MALDI-TOF DNA data. Wavelet transforms as described
herein have superior denoising properties as compared to conventional Fourier
analysis techniques. Wavelet transformation has proven to be particularly
effective in interpreting the inherently noisy MALDI-TOF spectra of DNA
samples. In using wavelets, a "small wave" or "scaling function" is used to
transform a data set into stages, with each stage representing a frequency
component in the data set. Using wavelet transformation, mass spectrometry
data can be processed, filtered, and analyzed with sufficient discrimination to be
useful for identification of the DNA composition for a biological sample.
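
The idea of transforming a data set into stages can be illustrated with a
minimal Python sketch. The Haar wavelet is used here only because it is the
simplest case; it is an assumption made for illustration and is not stated
to be the wavelet of the preferred embodiment. The function names are
hypothetical.

```python
import math


def haar_step(signal):
    """One Haar stage: scaled pairwise sums (approximation) and differences (detail)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_decompose(signal, stages):
    """Return the final approximation and the detail coefficients of each stage.

    The first list in `details` holds the highest-frequency stage.
    """
    if len(signal) % (2 ** stages) != 0:
        raise ValueError("signal length must be divisible by 2**stages")
    approx, details = list(signal), []
    for _ in range(stages):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


approx, details = haar_decompose([4.0, 6.0, 5.0, 5.0, 20.0, 22.0, 5.0, 3.0], stages=2)
print(approx)
print(details)
```

Each successive stage captures coarser structure, so rapid point-to-point
fluctuations end up mostly in the first detail stage.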

Referring again to FIG. 25, the data received in block 40 is denoised in
block 45. The denoised data then has a baseline correction applied in block 50.
A baseline correction is generally necessary as data coming from the test
instrument, in particular a mass spectrometer instrument, has data arranged in a
generally exponentially decaying manner. This generally exponential decaying
arrangement is not due to the composition of the biological sample, but is a
result of the physical properties and characteristics of the test instrument, and
other chemicals involved in DNA sample preparation. Accordingly, baseline
correction substantially corrects the data to remove a component of the data
attributable to the test system, and sample preparation characteristics.
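
One generic way to remove a slowly decaying background is to estimate the
baseline as a moving-window minimum and subtract it. The sketch below is a
stand-in chosen for illustration; it is not presented as the baseline
correction used in block 50, and the function name and window size are
hypothetical.

```python
def subtract_baseline(intensities, half_window):
    """Subtract a moving-window minimum from each data point."""
    n = len(intensities)
    corrected = []
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        baseline = min(intensities[lo:hi])  # local estimate of the background
        corrected.append(intensities[i] - baseline)
    return corrected


# A decaying background with one peak at index 3.
print(subtract_baseline([10, 8, 6, 12, 4, 3, 2], half_window=1))
```

The window must be wider than the peaks of interest, otherwise the peaks
themselves are absorbed into the baseline estimate.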

After denoising in block 45 and the baseline correction in block 50, a
signal remains which is generally indicative of the composition of the biological
sample. However, due to the extraordinary discrimination required for analyzing
the DNA composition of the biological sample, the composition is not readily
apparent from the denoised and corrected signal. For example, although the
signal may include peak areas, it is not yet clear whether these "putative" peaks
actually represent a DNA composition, or whether the putative peaks are the result
of a systemic or chemical aberration. Further, any call of the composition of the
biological sample would have a probability of error which would be unacceptable
for clinical or therapeutic purposes. In such critical situations, there needs to be
a high degree of certainty that any call or identification of the sample is
accurate. Therefore, additional data processing and interpretation is necessary
before the sample can be accurately and confidently identified.

Since the quantity of data resulting from each mass spectrometry test is
typically thousands of data points, and an automated system may be set to
perform hundreds or even thousands of tests per hour, the quantity of mass
spectrometry data generated is enormous. To facilitate efficient transmission
and storage of the mass spectrometry data, block 55 shows that the denoised
and baseline corrected data is compressed.

In a preferred embodiment, the biological sample is selected and
processed to have only a limited range of possible compositions. Accordingly, it
is known where peaks indicating composition should be located, if
present. Taking advantage of knowing the location of these expected peaks, in
block 60 the method 35 matches putative peaks in the processed signal to the
location of the expected peaks. In such a manner, the probability of each
putative peak in the data being an actual peak indicative of the composition of
the biological sample can be determined. Once the probability of each peak is
determined in block 60, then in block 65 the method 35 statistically determines
the composition of the biological sample, and determines if confidence is high
enough to call a genotype.

Referring again to block 40, data is received from the test instrument,
which is preferably a mass spectrometer. In a specific illustration, FIG. 26
shows an example of data from a mass spectrometer. The mass spectrometer
data 70 generally comprises data points distributed along an x-axis 71 and a
y-axis 72. The x-axis 71 represents the mass of particles detected, while the
y-axis 72 represents a numerical concentration of the particles. As can be seen in
FIG. 26, the mass spectrometry data 70 is generally exponentially decaying, with
data at the left end 73 of the x-axis 71 generally decaying in an exponential
manner toward data at the heavier end 74 of the x-axis 71. However, the general
exponential presentation of the data is not indicative of the composition of the
biological sample, but is more reflective of systematic error and characteristics.
Further, as described above and illustrated in FIG. 26, considerable noise exists
in the mass spectrometry DNA data 70.

Referring again to block 45, where the raw data received in block 40 is
denoised, the denoising process will be described in more detail. As illustrated
in FIG. 25, the denoising process generally entails 1) performing a wavelet
transformation on the raw data to decompose the raw data into wavelet stage
coefficients; 2) generating a noise profile from the highest stage of wavelet
coefficients; and 3) applying a scaled noise profile to other stages in the wavelet
transformation. Each step of the denoising process is further described below.

Referring now to FIG. 27, the wavelet transformation of the raw mass
spectrometry data is generally diagramed. Using wavelet transformation
techniques, the mass spectrometry data 70 is sequentially transformed into
stages. In each stage the data is represented in a high stage and a low stage,
with the low stage acting as the input to the next sequential stage. For
example, the mass spectrometry data 70 is transformed into stage 0 high data
82 and stage 0 low data 83. The stage 0 low data 83 is then used as an input
to the next level transformation to generate stage 1 high data 84 and stage 1
low data 85. In a similar manner, the stage 1 low data 85 is used as an input to
be transformed into stage 2 high data 86 and stage 2 low data 87. The
transformation is continued until no more useful information can be derived by
further wavelet transformation. For example, in the preferred embodiment a
24-point wavelet is used. More particularly, a wavelet commonly referred to as
the Daubechies 24 is used to decompose the raw data. However, it will be
appreciated that other wavelets can be used for the wavelet transformation.
Since each stage in a wavelet transformation has one-half the data points of the
previous stage, the wavelet transformation can be continued until the stage n
low data 89 has around 50 points. Accordingly, the stage n high 88 would
contain about 100 data points. Since the preferred wavelet is 24 points long,
little data or information can be derived by continuing the wavelet transformation
on a data set of around 50 points.
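
The staged decomposition described above can be sketched as follows. This is only an illustration under stated assumptions: it substitutes the two-point Haar wavelet for the 24-point Daubechies wavelet of the preferred embodiment so that it needs no external library, and the function names and the `min_points` parameter are hypothetical.

```python
import math

def haar_step(signal):
    """One transform stage: split a signal into (high, low) halves.

    Haar is a stand-in here; the preferred embodiment uses Daubechies 24.
    If the input length is odd, the final sample is dropped.
    """
    s = 1.0 / math.sqrt(2.0)
    high, low = [], []
    for a, b in zip(signal[0::2], signal[1::2]):
        low.append((a + b) * s)   # smooth (low frequency) part
        high.append((a - b) * s)  # detail (high frequency) part
    return high, low

def decompose(signal, min_points=50):
    """Feed each stage's low data into the next stage.

    Stops once another stage would leave fewer than min_points low points.
    Returns the list of high stages (stage 0 first) and the final low data.
    """
    highs = []
    low = list(signal)
    while len(low) >= 2 * min_points:
        high, low = haar_step(low)
        highs.append(high)
    return highs, low
```

For an 800-point input, this yields high stages of 400, 200, 100 and 50 points and a final low stage of 50 points.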

FIG. 28 shows an example of stage 0 high data 95. Since stage 0 high
data 95 is generally indicative of the highest frequencies in the mass
spectrometry data, stage 0 high data 95 will closely relate to the quantity of
high frequency noise in the mass spectrometry data. In FIG. 29, an exponential
fitting formula has been applied to the stage 0 high data 95 to generate a stage
0 noise profile 97. In particular, the exponential fitting formula is in the format
$A_0 + A_1 \exp(-A_2 m)$. It will be appreciated that other exponential fitting
formulas or other types of curve fits may be used.

Referring now to FIG. 30, noise profiles for the other high stages are
determined. Since the later data points in each stage will likely be representative
of the level of noise in each stage, only the later data points in each stage are
used to generate a standard deviation figure that is representative of the noise
content in that particular stage. More particularly, in generating the noise profile
for each remaining stage, only the last five percent of the data points in each
stage are analyzed to determine a standard deviation number. It will be
appreciated that other numbers of points, or alternative methods, could be used
to generate such a standard deviation figure.

The standard deviation number for each stage is used with the stage 0
noise profile (the exponential curve) 97 to generate a scaled noise profile for
each stage. For example, FIG. 30 shows that stage 1 high 98 has stage 1
high data 103 with the last five percent of the data points represented by area
99. The points in area 99 are evaluated to determine a standard deviation
number indicative of the noise content in stage 1 high data 103. The standard
deviation number is then used with the stage 0 noise profile 97 to generate a
stage 1 noise profile.

In a similar manner, stage 2 high 100 has stage 2 high data 104 with the
last five percent of points represented by area 101. The data points in area 101
are then used to calculate a standard deviation number which is then used to
scale the stage 0 noise profile 97 to generate a noise profile for stage 2 data.
This same process is continued for each of the stage high data as shown by the
stage n high 105. For stage n high 105, stage n high data 108 has the last five
percent of data points indicated in area 106. The data points in area 106 are
used to determine a standard deviation number for stage n. The stage n
standard deviation number is then used with the stage 0 noise profile 97 to
generate a noise profile for stage n. Accordingly, each of the high data stages
has a noise profile.
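
A minimal sketch of the per-stage noise estimate follows. The text states only that the standard deviation of the last five percent of points is "used with" the stage 0 noise profile. The ratio-based scaling in `scaled_profile` is therefore an assumption made for illustration, as are the function names.

```python
import statistics

def tail_std(stage_high, fraction=0.05):
    """Standard deviation of the last `fraction` of a stage's high data."""
    n = max(2, round(len(stage_high) * fraction))
    return statistics.pstdev(stage_high[-n:])

def scaled_profile(stage0_profile, sigma_stage, sigma_stage0):
    """Scale the stage 0 noise profile for another stage.

    ASSUMPTION: the profile is scaled by the ratio of the two standard
    deviation numbers; the source does not spell out the scaling rule.
    """
    ratio = sigma_stage / sigma_stage0
    return [p * ratio for p in stage0_profile]
```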

FIG. 31 shows how the noise profile is applied to the data in each stage.
Generally, the noise profile is used to generate a threshold which is applied to
the data in each stage. Since the noise profile is already scaled to adjust for the
noise content of each stage, calculating a threshold permits further adjustment
to tune the quantity of noise removed. Wavelet coefficients below the threshold
are ignored while those above the threshold are retained. Accordingly, the
remaining data has a substantial portion of the noise content removed.

Due to the characteristics of wavelet transformation, the lower stages,
such as stage 0 and 1, will have more noise content than the later stages such
as stage 2 or stage n. Indeed, stage n low data is likely to have little noise at
all. Therefore, in a preferred embodiment the noise profiles are applied more
aggressively in the lower stages and less aggressively in the later stages. For
example, FIG. 31 shows that stage 0 high threshold is determined by multiplying
the stage 0 noise profile by a factor of four. In such a manner, significant
numbers of data points in stage 0 high data 95 will be below the threshold and
therefore eliminated. Stage 1 high threshold 112 is set at two times the noise
profile for the stage 1 high data, and stage 2 high threshold 114 is set equal to
the noise profile for stage 2 high. Following this geometric progression, stage n
high threshold 116 is therefore determined by scaling the noise profile for each
respective stage n high by a factor equal to $1/2^{n-2}$. It will be appreciated
that other factors may be applied to scale the noise profile for each stage. For
example, the noise profile may be scaled more or less aggressively to
accommodate specific systemic characteristics or sample compositions. As
indicated above, stage n low data does not have a noise profile applied as stage
n low data 118 is assumed to have little or no noise content. After the scaled
noise profiles have been applied to each high data stage, the mass spectrometry
data 70 has been denoised and is ready for further processing. A wavelet
transformation of the denoised signal results in the sparse data set 120 as
shown in FIG. 31.
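
The stage-dependent thresholding can be sketched as below, taking the stage factor to be 1/2^(n-2), which reproduces the stated factors of four, two and one for stages 0, 1 and 2. Comparing the magnitude of each coefficient against the threshold, and setting rejected coefficients to zero, are assumptions of this sketch. The function names are hypothetical.

```python
def threshold_factor(stage):
    """Multiplier applied to a stage's noise profile: 4, 2, 1, 0.5, ..."""
    return 1.0 / 2.0 ** (stage - 2)

def apply_threshold(coeffs, noise_profile, stage):
    """Keep coefficients above the scaled noise profile, zero the rest.

    ASSUMPTION: the comparison uses the coefficient magnitude, and
    "ignored" coefficients are replaced by 0.0.
    """
    factor = threshold_factor(stage)
    return [c if abs(c) > factor * n else 0.0
            for c, n in zip(coeffs, noise_profile)]
```

With a flat noise profile of 1.0, stage 0 coefficients `[5.0, -3.0, 0.5]` reduce to `[5.0, 0.0, 0.0]`, since only the first exceeds the stage 0 threshold of 4.0.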

Referring again to FIG. 25, the mass spectrometry data received in block
40 has been denoised in block 45 and is now passed to block 50 for baseline
correction. Before performing baseline correction, the artifacts introduced by the
wavelet transformation procedure are preferably removed. Wavelet
transformation results vary slightly depending upon which point of the wavelet is
used as a starting point. For example, the preferred embodiment uses the
24-point Daubechies-24 wavelet. By starting the transformation at the 0 point of
the wavelet, a slightly different result will be obtained than if starting at points 1
or 2 of the wavelet. Therefore, the denoised data is transformed using every
possible starting point, with the results averaged to determine a final
denoised and shifted signal. For example, FIG. 33 shows that the wavelet
coefficient is applied 24 different times and then the results averaged to
generate the final data set. It will be appreciated that other techniques may be
used to accommodate the slight error introduced due to wavelet shifting.

The formula 125 is generally indicated in FIG. 33. Once the signal has
been denoised and shifted, a denoised and shifted signal 130 is generated as
shown in FIG. 58. FIG. 34 shows an example of the wavelet coefficient 135
data set from the denoised and shifted signal 130.

FIG. 36 shows that putative peak areas 145, 147, and 149 are located in
the denoised and shifted signal 150. The putative peak areas are systematically
identified by taking a moving average along the signal 150 and identifying
sections of the signal 150 which exceed a threshold related to the moving
average. It will be appreciated that other methods can be used to identify
putative peak areas in the signal 150.
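
One way to picture the moving-average test is the sketch below. The window length, the multiplicative factor relating the threshold to the moving average, and the function names are all assumptions chosen for illustration; the text says only that the threshold is "related to" the moving average.

```python
def moving_average(signal, window):
    """Centered moving average; the window is truncated at the signal edges."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out

def putative_peak_areas(signal, window=11, factor=2.0):
    """Return (start, end) index pairs where signal > factor * moving average.

    ASSUMPTION: window=11 and factor=2.0 are illustrative values only.
    """
    avg = moving_average(signal, window)
    areas, start = [], None
    for i, (y, a) in enumerate(zip(signal, avg)):
        above = y > factor * a
        if above and start is None:
            start = i                      # a new section begins
        elif not above and start is not None:
            areas.append((start, i - 1))   # the section just ended
            start = None
    if start is not None:
        areas.append((start, len(signal) - 1))
    return areas
```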

Putative peak areas 145, 147 and 149 are removed from the signal 150
to create a peak-free signal 155 as shown in FIG. 37. The peak-free signal 155
is further analyzed to identify remaining minimum values 157, and the remaining
minimum values 157 are connected to generate the peak-free signal 155.

FIG. 38 shows a process of using the peak-free signal 155 to generate a
baseline 170 as shown in FIG. 39. As shown in block 162, a wavelet
transformation is performed on the peak-free signal 155. All the stages from the
wavelet transformation are eliminated in block 164 except for the n low stage.
The n low stage will generally indicate the lowest frequency component of the
peak-free signal 155 and therefore will generally indicate the system exponential
characteristics. Block 166 shows that a signal is reconstructed from the n low
coefficients and the baseline signal 170 is generated in block 168.

FIG. 39 shows a denoised and shifted data signal 172 positioned adjacent
a correction baseline 170. The baseline correction 170 is subtracted from the
denoised and shifted signal 172 to generate a signal 175 having a baseline
correction applied as shown in FIG. 40. Although such a denoised, shifted, and
corrected signal is sufficient for most identification purposes, the putative peaks
in signal 175 are not identifiable with sufficient accuracy or confidence to call
the DNA composition of a biological sample.

Referring again to FIG. 25, the data from the baseline correction 50 is
now compressed in block 55. The compression technique used in a preferred
embodiment is detailed in FIG. 41. In FIG. 41 the baseline corrected
data is presented in an array format 182 with x-axis points 183 having an
associated data value 184. The x-axis is indexed by the non-zero wavelet
coefficients, and the associated value is the value of the wavelet coefficient. In
the illustrated data example in table 182, the maximum value 184 is indicated to
be 1000. Although a particularly advantageous compression technique for mass
spectrometry data is shown, it will be appreciated that other compression
techniques can be used. Although not preferred, the data may also be stored
without compression.

In compressing the data according to a preferred embodiment, an
intermediate format 186 is generated. The intermediate format 186 generally
comprises a real number having a whole number portion 188 and a decimal
portion 190. The whole number portion is the x-axis point 183 while the
decimal portion is the value data 184 divided by the maximum data value. For
example, in the data 182 a data value "25" is indicated at x-axis point "100".
The intermediate value for this data point would be "100.025".

From the intermediate compressed data 186 the final compressed data
195 is generated. The first point of the intermediate data file becomes the
starting point for the compressed data. Thereafter each data point in the
compressed data 195 is calculated as follows: the whole number portion (left of
the decimal) is replaced by the difference between the current and the last whole
number. The remainder (right of the decimal) remains intact. For example, the
starting point of the compressed data 195 is shown to be the same as the
intermediate data point, which is "100.025". Comparing the first
intermediate data point "100.025" with the second intermediate data point
"150.220" gives "50.220". Therefore, "50.220" becomes the second point of the
compressed data 195. In a similar manner, the second intermediate point is
"150.220" and the third intermediate data point is "500.0001". Therefore, the
third compressed data becomes "350.000". The calculation for determining
compressed data points is continued until the entire array of data points is
converted to a single array of real numbers.
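
The two-step encoding can be sketched as follows, using exact decimal arithmetic so that the whole and fractional parts separate cleanly. The function names are hypothetical, and the sketch assumes every data value is non-negative and strictly smaller than the divisor, so that the fractional part never spills into the whole number portion.

```python
from decimal import Decimal

def compress(points, max_value):
    """Encode sorted (x_index, value) pairs as one array of real numbers.

    Whole part: x index (first point) or the gap to the previous x index.
    Fractional part: value / max_value.
    ASSUMPTION: 0 <= value < max_value for every point.
    """
    intermediate = [Decimal(x) + Decimal(v) / Decimal(max_value)
                    for x, v in points]
    out = [intermediate[0]]
    prev_whole = int(intermediate[0])
    for cur in intermediate[1:]:
        whole = int(cur)
        out.append(Decimal(whole - prev_whole) + (cur - whole))
        prev_whole = whole
    return out

def decompress(compressed, max_value):
    """Invert compress(): rebuild the (x_index, value) pairs."""
    points, whole = [], 0
    for c in compressed:
        gap = int(c)
        whole += gap  # the first "gap" is the starting x index itself
        points.append((whole, (c - gap) * max_value))
    return points
```

Using the first two points of the example in the text plus a hypothetical third point, `compress([(100, 25), (150, 220), (500, 0)], 1000)` gives 100.025, 50.22 and 350. Storing small gaps instead of large absolute x positions is what makes long runs of removed coefficients cheap to represent.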

FIG. 42 generally describes the method of compressing mass
spectrometry data, showing that the data file in block 201 is presented as an
array of coefficients in block 202. The data starting point and maximum are
determined as shown in block 203, and the intermediate real numbers are
calculated in block 204 as described above. With the intermediate data points
generated, the compressed data is generated in block 205. The described
compression method is highly advantageous and efficient for compressing data
sets such as a processed data set from a mass spectrometry instrument. The
method is particularly useful for data, such as mass spectrometry data, that uses
large numbers and has been processed to have occasional lengthy gaps in x-axis
data. Accordingly, an x-y data array for processed mass spectrometry data may
be stored with an effective compression rate of 10x or more. Although the
compression technique is applied to mass spectrometry data, it will be
appreciated that the method may also advantageously be applied to other data
sets.

Referring again to FIG. 25, peak heights are now determined in block 60.
The first step in determining peak height is illustrated in FIG. 43 where the signal
210 is shifted left or right to correspond with the position of expected peaks.
As the set of possible compositions in the biological sample is known before the
mass spectrometry data is generated, the possible positioning of expected peaks
is already known. These possible peaks are referred to as expected peaks, such
as expected peaks 212, 214, and 216. Due to calibration or other errors in the
test instrument data, the entire signal may be shifted left or right from its actual
position. Therefore, putative peaks located in the signal, such as putative peaks
218, 222, and 224, may be compared to the expected peaks 212, 214, and 216,
respectively. The entire signal is then shifted such that the putative peaks align
more closely with the expected peaks.

Once the putative peaks have been shifted to match expected peaks, the
strongest putative peak is identified in FIG. 44. In a preferred embodiment, the
strongest peak is calculated as a combination of analyzing the overall peak
height and area beneath the peak. For example, a moderately high but wide
peak would be stronger than a very high peak that is extremely narrow. With
the strongest putative peak identified, such as putative peak 225, a Gaussian
curve 228 is fit to the peak 225. Once the Gaussian is fit, the width (W) of the
Gaussian is determined and will be used as the peak width for future
calculations.




As generally addressed above, the denoised, shifted, and
baseline-corrected signal is not sufficiently processed for confidently calling the
DNA composition of the biological sample. For example, although the baseline has
generally been removed, there are still residual baseline effects present. These
residual baseline effects are therefore removed to increase the accuracy and
confidence in making identifications.

To remove the residual baseline effects, FIG. 45 shows that the putative
peaks 218, 222, and 224 are removed from the baseline corrected signal. The
peaks are removed by identifying a center line 230, 232, and 234 of the
putative peaks 218, 222, and 224, respectively, and removing an area to the left
and to the right of the identified center line. For each putative peak, an area
equal to twice the width (W) of the Gaussian is removed from the left of the
center line, while an area equivalent to 50 daltons is removed from the right of
the center line. It has been found that the area representing 50 daltons is
adequate to sufficiently remove the effect of salt adducts which may be
associated with an actual peak. Such adducts appear to the right of an actual
peak and are a natural effect from the chemistry involved in acquiring a mass
spectrum. Although a 50 dalton buffer has been selected, it will be appreciated
that other ranges or methods can be used to reduce or eliminate adduct effects.

The peaks are removed and remaining minima 247 located as shown in
FIG. 46 with the minima 247 connected to create signal 245. A quartic
polynomial is applied to signal 245 to generate a residual baseline 250 as shown
in FIG. 47. The residual baseline 250 is subtracted from the signal 225 to
generate the final signal 255 as indicated in FIG. 48. Although the residual
baseline is the result of a quartic fit to signal 245, it will be appreciated that
other techniques can be used to smooth or fit the residual baseline.

To determine peak height, as shown in FIG. 49, a Gaussian such as
Gaussian 266, 268, and 270 is fit to each of the peaks, such as peaks 260,
262, and 264, respectively. Accordingly, the height of the Gaussian is
determined as height 272, 274, and 276. Once the height of each Gaussian
peak is determined, then the method of identifying a biological compound 35 can
move into the genotyping phase 65 as shown in FIG. 25.

An indication of the confidence that each putative peak is an actual peak
can be discerned by calculating a signal-to-noise ratio for each putative peak.
Accordingly, putative peaks with a strong signal-to-noise ratio are generally more
likely to be an actual peak than a putative peak with a lower signal-to-noise
ratio. As described above and shown in FIG. 50, the height of each peak, such
as height 272, 274, and 276, is determined for each peak, with the height being
an indicator of signal strength for each peak. The noise profile, such as noise
profile 97, is extrapolated into noise profile 280 across the identified peaks. At
the center line of each of the peaks, a noise value is determined, such as noise
value 282, 283, and 284. With signal values and noise values generated,
signal-to-noise ratios can be calculated for each peak. For example, the
signal-to-noise ratio for the first peak in FIG. 50 would be calculated as signal
value 272 divided by noise value 282, and in a similar manner the signal-to-noise
ratio of the middle peak in FIG. 50 would be determined as signal 274 divided by
noise value 283.

Although the signal-to-noise ratio is generally a useful indicator of the
presence of an actual peak, further processing has been found to increase the
confidence by which a sample can be identified. For example, the
signal-to-noise ratio for each peak in the preferred embodiment is preferably
adjusted by the goodness of fit between a Gaussian and each putative peak. It is a
characteristic of a mass spectrometer that sample material is detected in a
manner that generally complies with a normal distribution. Accordingly, greater
confidence will be associated with a putative signal having a Gaussian shape
than a signal that has a less normal distribution. The error resulting from having
a non-Gaussian shape can be referred to as a "residual error".

Referring to FIG. 51, a residual error is calculated by taking a root mean
square calculation between the Gaussian 293 and the putative peak 290 in the
data signal. The calculation is performed on data within one width on either side
of a center line of the Gaussian. The residual error is calculated as:

$$\sqrt{\frac{\sum (G - R)^2}{N}}$$

where G is the Gaussian signal value, R is the putative peak value, and N
is the number of points from -W to +W. The calculated residual error is
used to generate an adjusted signal-to-noise ratio, as described below.

An adjusted signal-to-noise ratio is calculated for each putative peak using
the formula $(S/N) \cdot \exp(-0.1 \cdot R)$, where S/N is the signal-to-noise
ratio, and R is the residual error determined above. Although the preferred
embodiment calculates an adjusted signal-to-noise ratio using a residual error for
each peak, it will be appreciated that other techniques can be used to account for
the goodness of fit between the Gaussian and the actual signal.
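
The residual error and the adjusted ratio can be sketched together as below. The root mean square form follows the description in the text. The exponent constant `k=0.1` is an assumption about the adjustment formula, and the function names are hypothetical.

```python
import math

def residual_error(gaussian, peak):
    """Root mean square difference between the fitted Gaussian and the data.

    Both inputs hold the N samples lying within one width W on either
    side of the Gaussian center line.
    """
    n = len(gaussian)
    return math.sqrt(sum((g - r) ** 2 for g, r in zip(gaussian, peak)) / n)

def adjusted_snr(signal, noise, residual, k=0.1):
    """Signal-to-noise ratio penalized for a non-Gaussian peak shape.

    ASSUMPTION: the penalty is exp(-k * residual) with k = 0.1.
    """
    return (signal / noise) * math.exp(-k * residual)
```

A perfectly Gaussian peak has zero residual error and keeps its raw ratio, while any misfit lowers the adjusted value.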

Referring now to FIG. 52, a probability is determined that a putative peak
is an actual peak. In making the determination of peak probability, a probability
profile 300 is generated where the adjusted signal-to-noise ratio is the x-axis and
the probability is the y-axis. Probability is necessarily in the range between a
0% probability and a 100% probability, which is indicated as 1. Generally, the
higher the adjusted signal-to-noise ratio, the greater the confidence that a
putative peak is an actual peak.

At some target value for the adjusted signal-to-noise ratio, it has been found
that the probability is 100% that the putative peak is an actual peak and can
confidently be used to identify the DNA composition of a biological sample.
However, the target value of adjusted signal-to-noise ratio where the probability
is assumed to be 100% is a variable parameter which is to be set according to
application specific criteria. For example, the target signal-to-noise ratio will be
adjusted depending upon trial experience, sample characteristics, and the
acceptable error tolerance in the overall system. More specifically, for situations
requiring a conservative approach where error cannot be tolerated, the target
adjusted signal-to-noise ratio can be set to, for example, 10 and higher.
Accordingly, 100% probability will not be assigned to a peak unless the adjusted
signal-to-noise ratio is 10 or over.

In other situations, a more aggressive approach may be taken as sample
data is more pronounced or the risk of error may be reduced. In such a
situation, the system may be set to assume a 100% probability with a 5 or
greater target signal-to-noise ratio. Of course, an intermediate signal-to-noise
ratio target figure can be selected, such as 7, when a moderate risk of error can
be assumed. Once the target adjusted signal-to-noise ratio is set for the method,
then for any adjusted signal-to-noise ratio a probability can be determined that a
putative peak is an actual peak.

Due to the chemistry involved in performing an identification test,
especially a mass spectrometry test of a sample prepared by DNA amplifications,
the allelic ratio between the signal strength of the highest peak and the signal
strength of the second (or third and so on) highest peak should fall within an
expected ratio. If the allelic ratio falls outside of normal guidelines, the preferred
embodiment imposes an allelic ratio penalty to the probability. For example,
FIG. 53 shows an allelic penalty 315 which has an x-axis 317 that is the ratio
of the signal strength of the second highest peak divided by the signal
strength of the highest peak. The y-axis 319 assigns a penalty between 0 and 1
depending on the determined allelic ratio. In the preferred embodiment, it is
assumed that allelic ratios over 30% are within the expected range and therefore
no penalty is applied. Between a ratio of 10% and 30%, the penalty is linearly
increased until at allelic ratios below 10% it is assumed the second-highest peak
is not real. For allelic ratios between 10% and 30%, the allelic penalty chart
315 is used to determine a penalty 319, which is multiplied by the peak
probability determined in FIG. 52 to determine a final peak probability. Although
the preferred embodiment incorporates an allelic ratio penalty to account for a
possible chemistry error, it will be appreciated that other techniques may be
used. Similar treatment will be applied to the other peaks.
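
The allelic penalty can be sketched as a piecewise-linear multiplier. This sketch assumes the penalty value is the factor multiplied into the peak probability: 1 (no penalty) at ratios of 30% and above, 0 (peak treated as not real) at 10% and below, and linear in between. The function name and parameter names are hypothetical.

```python
def allelic_penalty(ratio, low=0.10, high=0.30):
    """Multiplier in [0, 1] applied to the peak probability.

    ratio: signal strength of the lower peak divided by that of the
    highest peak.
    ASSUMPTION: linear interpolation between the two breakpoints.
    """
    if ratio >= high:
        return 1.0
    if ratio <= low:
        return 0.0
    return (ratio - low) / (high - low)
```

A final peak probability would then be the peak probability multiplied by `allelic_penalty(ratio)`. For a ratio of 20%, the multiplier is about 0.5.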

With the peak probability of each peak determined, the statistical
probability for various composition components may be determined. As an
example, consider determining the probability of each of the three possible
combinations of two peaks, peak G and peak C, namely combinations GG, CC and
GC. FIG. 54 shows an example where a most probable peak 325 is determined
to have a final peak probability of 90%. Peak 325 is positioned such that it
represents a G component in the biological sample. Accordingly, it can be
maintained that there is a 90% probability that G exists in the biological sample.
Also in the example shown in FIG. 54, the second highest probability is peak
330 which has a peak probability of 20%. Peak 330 is at a position associated
with a C composition. Accordingly, it can be maintained that there is a 20%
probability that C exists in the biological sample.

With the probability of G existing (90%) and the probability of C existing
(20%) as a starting point, the probability of combinations of G and C existing
can be calculated. For example, FIG. 54 indicates that the probability of GG
existing 329 is calculated as 72%. The probability of GG is
equal to the probability of G existing (90%) multiplied by the probability of C not
existing (100% - 20%). So if the probability of G existing is 90% and the
probability of C not existing is 80%, the probability of GG is 72%.

In a similar manner, the probability of CC existing is equivalent to the
probability of C existing (20%) multiplied by the probability of G not existing
(100% - 90%). As shown in FIG. 54, the probability of C existing is 20% while
the probability of G not existing is 10%, so therefore the probability of CC is
only 2%. Finally, the probability of GC existing is equal to the probability of G
existing (90%) multiplied by the probability of C existing (20%). So if the
probability of G existing is 90% and the probability of C existing is 20%, the
probability of GC existing is 18%. In summary form, then, the probability of the
composition of the biological sample is:
probability of GG: 72%;
probability of GC: 18%; and
probability of CC: 2%.
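
The arithmetic of this example can be written as a short sketch. The function name is hypothetical; the three products are those stated in the text.

```python
def genotype_probabilities(p_g, p_c):
    """Combination probabilities from the two final peak probabilities."""
    return {
        "GG": p_g * (1.0 - p_c),  # G present, C absent
        "GC": p_g * p_c,          # both present
        "CC": p_c * (1.0 - p_g),  # C present, G absent
    }
```

Calling `genotype_probabilities(0.9, 0.2)` gives approximately 0.72, 0.18 and 0.02 for GG, GC and CC.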
Once the probabilities of each of the possible combinations have been
determined, FIG. 55 is used to decide whether or not sufficient confidence exists
to call the genotype. FIG. 55 shows a call chart 335 which has an x-axis 337
which is the ratio of the highest combination probability to the second highest
combination probability. The y-axis 339 simply indicates whether the ratio is
sufficiently high to justify calling the genotype. The value of the ratio may be
indicated by M 340. The value of M is set depending upon trial data, sample
composition, and the ability to accept error. For example, the value M may be
set relatively high, such as to a value 4, so that the highest probability must be at
least four times greater than the second highest probability before confidence is
established to call a genotype. However, if a certain level of error may be
acceptable, the value of M may be set to a more aggressive value, such as to 3,
so that the ratio between the highest and second highest probabilities needs to
be only a ratio of 3 or higher. Of course, a moderate value may be selected for M
when a moderate risk can be accepted. Using the example of FIG. 54, where
the probability of GG was 72% and the probability of GC was 18%, the ratio
between 72% and 18% is 4.0. Therefore, whether M is set to 3, 3.5, or 4, the
system would call the genotype as GG. Although the preferred embodiment
uses a ratio between the two highest peak probabilities to determine if a
genotype confidently can be called, it will be appreciated that other methods
may be substituted. It will also be appreciated that the above techniques may
be used for calculating probabilities and choosing genotypes (or more general
DNA patterns) consisting of combinations of more than two peaks.
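
The call decision can be sketched as a ratio test against M. The function name, the `None` return for a no-call, and the handling of a zero runner-up probability are assumptions of this sketch.

```python
def call_genotype(probabilities, m=4.0):
    """Return the most probable combination, or None if confidence is low.

    probabilities: mapping of combination name to probability.
    A call is made when highest / second highest >= m.
    ASSUMPTION: a runner-up probability of zero always permits a call.
    """
    ranked = sorted(probabilities.items(), key=lambda kv: kv[1], reverse=True)
    (best, p1), (_, p2) = ranked[0], ranked[1]
    if p2 == 0 or p1 / p2 >= m:
        return best
    return None
```

With the percentages from the example, `call_genotype({"GG": 72, "GC": 18, "CC": 2}, m=4.0)` returns `"GG"`, since 72/18 is exactly 4. Raising `m` to 5 returns `None`, meaning no genotype is called.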

Referring now to FIG. 56, a flow chart is shown generally defining the
process of statistically calling genotype described above. In FIG. 56 block 402
shows that the height of each peak is determined and that in block 404 a noise
profile is extrapolated for each peak. The signal is determined from the height of
each peak in block 406 and the noise for each peak is determined using the
noise profile in block 408. In block 410, the signal-to-noise ratio is calculated
for each peak. To account for a non-Gaussian peak shape, a residual error is
determined in block 412 and an adjusted signal-to-noise ratio is calculated in
block 414. Block 416 shows that a probability profile is developed, with the
probability of each peak existing found in block 418. An allelic penalty may be
applied in block 420, with the allelic penalty applied to the adjusted peak
probability in block 422. The probability of each combination of components is
calculated in block 424 with the ratio between the two highest probabilities
being determined in block 426. If the ratio of probabilities exceeds a threshold
value then the genotype is called in block 428.

In another embodiment of the invention, the computing device 20 (Fig.
24) supports "standardless" genotyping by identifying data peaks that contain
putative SNPs. Standardless genotyping is used, for example, where insufficient
information is known about the samples to determine a distribution of expected
peak locations, against which an allelic penalty as described above can be
reliably calculated. This permits the computing device to be used for
identification of peaks that contain putative SNPs from data generated by any
assay that fragments a targeted DNA molecule. For such standardless
genotyping, peaks that are associated with an area under the data curve that
deviates significantly from the typical area of other peaks in the data spectrum
are identified and their corresponding mass (location along the x-axis) is
determined.

More particularly, peaks that deviate significantly from the average area
of other peaks in the data are identified, and the expected allelic ratio between
data peaks is defined in terms of the ratio of the area under the data peaks.
Theoretically, where each genetic locus has the same molar concentration of
analyte, the area under each corresponding peak should be the same, thus
producing a 1.0 ratio of the peak area between any two peaks. In accordance
with the invention, peaks having a smaller ratio relative to the other peaks in the
data will not be recognized as peaks. More particularly, peaks having an area
ratio smaller than 30% relative to a nominal value for peak area will be assigned
an allelic penalty. The mass of the remaining peaks (their location along the
x-axis of the data) will be determined based on oligonucleotide standards.

Fig. 57 shows a flow diagram representation of the processing by the
computing device 20 (Fig. 24) when performing standardless genotyping. In the
first operation, represented by the flow diagram box numbered 502, the
computing device receives data from the mass spectrometer. Next, the height
of each putative peak in the data sample is determined, as indicated by the block
504. After the height of each peak in the mass spectrometer data is
determined, a de-noise process 505 is performed, beginning with an
extrapolation of the noise profile (block 506), followed by finding the noise of
each peak (block 508) and calculating the signal to noise ratio for each data
sample (block 510). Each of these operations may be performed in accordance
with the description above for denoise operations 45 of Fig. 25. Other suitable
denoise operations will occur to those skilled in the art.

The next operation is to find the residual error associated with each data
point. This is represented by the block 512 in Figure 57. The next step, block
514, involves calculating an adjusted signal to noise ratio for each identified
peak. A probability profile is developed next (block 516), followed by a
determination of the peak probabilities at block 518. In the preferred
embodiment, the denoise operations of Fig. 57, comprising block 502 to block
518, comprise the corresponding operations described above in conjunction with
Fig. 56 for block 402 through block 418, respectively.

The next action for the standardless genotype processing is to determine
an allelic penalty for each peak, indicated by the block 524. As noted above,
the standardless genotype processing of Fig. 57 determines an allelic penalty by
comparing the areas under the peaks. Therefore, rather than comparing signal strength
ratios to determine an allelic penalty, such as described above for Fig. 53, the
standardless processing determines the area under each of the identified peaks
and compares the ratio of those areas. The area under each peak
may be computed using conventional numerical analysis techniques for
calculating the area under a curve for experimental data.
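
As one example of such a conventional numerical technique, the trapezoidal rule can be applied to the sampled spectrum between the two edges of a peak. The following is a minimal sketch only. The function name peak_area and the window bounds lo and hi are hypothetical, the x values are assumed to be sorted in ascending order, and no baseline subtraction is performed.

```python
from typing import Sequence


def peak_area(x: Sequence[float], y: Sequence[float],
              lo: float, hi: float) -> float:
    """Trapezoidal-rule area under the sampled curve for lo <= x <= hi.

    Hypothetical helper: assumes x is sorted ascending and that lo and hi
    are the chosen edges of the peak. No baseline correction is applied.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    # Keep only the samples that fall inside the peak window.
    pts = [(xi, yi) for xi, yi in zip(x, y) if lo <= xi <= hi]
    area = 0.0
    # Sum the trapezoids formed by each pair of neighboring samples.
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += 0.5 * (y0 + y1) * (x1 - x0)
    return area


# A triangular peak of height 1 and base 2 has area 1.
print(peak_area([0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 0.0, 5.0], 0.0, 2.0))
```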

Thus, the allelic penalty is assigned in accordance with Fig. 58, which
shows that no penalty is assigned to peaks having a peak area relative to an
expected average area value that is greater than 0.30 (30%). The allelic penalty
is applied to the peak probability value, which may be determined according to
a process such as that described in Fig. 52. It should be apparent from Fig. 58
that the allelic penalty imposed for peaks below a ratio of 30% is that such
peaks will be removed from further measurement and processing. Other penalty
schemes, however, may be imposed in accordance with knowledge about the
data being processed, as determined by those skilled in the art.
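
The area-based penalty can be sketched as a simple filter. This is an illustration under stated assumptions, not a definitive implementation: the function name apply_area_penalty is hypothetical, the median of the observed areas is used as a stand-in for the expected average area when no nominal value is supplied, and a peak whose ratio is exactly 0.30 is kept, since the description does not address that boundary case.

```python
from statistics import median
from typing import Dict, Optional


def apply_area_penalty(areas: Dict[float, float],
                       nominal: Optional[float] = None,
                       min_ratio: float = 0.30) -> Dict[float, float]:
    """Map peak mass -> area ratio for peaks that survive the penalty.

    Hypothetical helper. `areas` maps a peak's mass (x-axis location) to
    its area. Peaks whose area ratio relative to `nominal` is below
    `min_ratio` are removed from further processing.
    """
    if not areas:
        return {}
    if nominal is None:
        # Assumption: the median stands in for the expected average area.
        nominal = median(areas.values())
    if nominal <= 0:
        raise ValueError("nominal area must be positive")
    kept = {}
    for mass, area in areas.items():
        ratio = area / nominal
        if ratio >= min_ratio:  # assumption: exactly 0.30 is kept
            kept[mass] = ratio
    return kept


areas = {5000.0: 100.0, 5300.0: 95.0, 5600.0: 20.0, 5900.0: 105.0}
print(sorted(apply_area_penalty(areas)))
```

In this illustrative data set the peak at mass 5600.0 has roughly one fifth of the nominal area and is therefore dropped, while the other three peaks remain.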

After the allelic penalty has been determined and applied, the
standardless genotype processing compares the location of the remaining
putative peaks to oligonucleotide standards to determine corresponding masses
in the processing for block 524. For standardless genotype data, the processing
of the block 524 is performed to determine mass and genotype, rather than
performing the operations corresponding to blocks 424, 426, and 428 of Fig. 56.
Techniques for performing such comparisons and determining mass will be
known to those skilled in the art.

In another embodiment, the computing device 20 (Fig. 24) permits the
detection and determination of the mass (location along the x-axis of the data) of
the sense and antisense strand of fragments generated in the assay. If desired,
the computing device may also detect and determine the quantity (area under
each peak) of the respective sense and antisense strands, using a similar
technique to that described above for standardless genotype processing. The
data generated for each type of strand may then be combined to achieve a data
redundancy and to thereby increase the confidence level of the determined
genotype. This technique obviates primer peaks that are often observed in data
from other diagnostic methods, thereby permitting a higher level of multiplexing.
In addition, when quantitation is used in pooling experiments, the ratio of the
measured peak areas is more reliably calculated than the peak identifying
technique, due to data redundancy.

Fig. 23 is a flow diagram that illustrates the processing implemented by
the computing device 20 to perform sense and antisense processing. In the first
operation, represented by the flow diagram box numbered 602, the computing
device receives data from the mass spectrometer. This data will include data for
the sense strand and antisense strand of assay fragments. Next, the height of
each putative peak in the data sample is determined, as indicated by the block
604. After the height of each peak in the mass spectrometer data is
determined, a de-noise process 605 is performed, beginning with an operation
that extrapolates the noise profile (block 606), followed by finding the noise of
each peak (block 608) and calculating the signal to noise ratio for each data
sample (block 610). Each of these operations may be performed in accordance
with the description above for the denoise operations 45 of Fig. 25. Other
suitable denoise operations will occur to those skilled in the art. The next
operation is to find the residual error associated with each data point. This is
represented by the block 612 in Figure 36.

After the residual error for the data of the sense strand and antisense
strand has been determined, processing to identify the genotypes will be
performed for the sense strand and also for the antisense strand. Therefore, Fig.
23 shows that processing includes sense strand processing (block 630) and
antisense strand processing (block 640). Each block 630, 640 includes
processing that corresponds to adjusting the signal to noise ratio, developing a
probability profile, determining an allelic penalty, adjusting the peak probability
by the allelic penalty, calculating genotype probabilities, and testing genotype
probability ratios, such as described above in conjunction with blocks 414
through 426 of Fig. 56. The processing of each block 630, 640 may, if desired,
include standardless processing operations such as described above in
conjunction with Fig. 57. The standardless processing may be included in place
of or in addition to the processing operations of Fig. 56.

After the genotype probability processing is completed, the data from the
sense strand and antisense strand processing is combined and compared to
expected database values to obtain the benefits of data redundancy as between
the sense strand and antisense strand. Those skilled in the art will understand
techniques to take advantage of known data redundancies between a sense
strand and antisense strand of assay fragments. This processing is represented
by the block 650. After the data from the two strands is combined for
processing, the genotype processing is performed (block 660) and the genotype
is identified.
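
The description leaves the combining technique to those skilled in the art. Purely as an illustration of how redundancy between the two strands could raise confidence, the sketch below multiplies the per-genotype probabilities obtained from each strand and renormalizes. This treats the two strand measurements as independent, which is an assumption and not something stated in the description. The function name combine_strands is hypothetical, and the antisense results are assumed to have already been mapped to the same genotype labels as the sense results.

```python
from typing import Dict


def combine_strands(sense: Dict[str, float],
                    antisense: Dict[str, float]) -> Dict[str, float]:
    """Combine per-genotype probabilities from the two strands.

    Hypothetical helper. Assumption: the two measurements are treated as
    independent, so the joint probability is the normalized product.
    """
    # Only genotypes scored on both strands can be combined.
    shared = set(sense) & set(antisense)
    joint = {g: sense[g] * antisense[g] for g in shared}
    total = sum(joint.values())
    if total == 0:
        return {}
    # Renormalize so the combined probabilities sum to 1.
    return {g: p / total for g, p in joint.items()}


combined = combine_strands({"GG": 0.6, "GC": 0.4}, {"GG": 0.7, "GC": 0.3})
print(max(combined, key=combined.get))
```

Under this assumption, agreement between the strands sharpens the result: a genotype favored at 0.6 on one strand and 0.7 on the other ends up with a combined probability above either individual value.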

Since modifications will be apparent to those of skill in this art, it is
intended that this invention be limited only by the scope of the appended claims.

WHAT IS CLAIMED IS:

1. A subcollection of samples from a target population, comprising:

a plurality of samples, wherein the samples are selected from the group
consisting of blood, tissue, body fluid, cell, seed, microbe, pathogen and
reproductive tissue samples; and

a symbology on the containers containing the samples, wherein the
symbology is representative of the source and/or history of each sample,
wherein:

the target population is a healthy population that has not been selected
for any disease state;

the collection comprises samples from the healthy population; and

the subcollection is obtained by sorting the collection according to
specified parameters.

2. The subcollection of claim 1, wherein the parameters are selected
from the group consisting of ethnicity, age, gender, height, weight, alcohol
intake, number of pregnancies, number of live births, vegetarians, type of
physical activity, state of residence and/or length of residence in a particular
state, educational level, age of parent at death, cause of parent death, former or
current smoker, length of time as a smoker, frequency of smoking, occurrence
of a disease in immediate family (parent, siblings, children), use of prescription
drugs and/or reason therefor, length and/or number of hospital stays and
exposure to environmental factors.

3. The subcollection of claim 1, wherein the symbology is a bar code. 

4. A method of producing a database, comprising:

identifying healthy members of a population;

obtaining data comprising identifying information and obtaining historical
information and data relating to the identified members of the population and
their immediate family;

entering the data into a database for each member of the population and
associating the member and the data with an indexer.

5. The method of claim 4, further comprising:

obtaining a body tissue or body fluid sample;

analyzing the body tissue or body fluid in the sample; and

entering the results of the analysis for each member into the database
and associating each result with the indexer representative of each member.

6. A database produced by the method of claim 4.

7. A database produced by the method of claim 5.

8. A database, comprising:

datapoints representative of a plurality of healthy organisms from
whom biological samples are obtained,

wherein each datapoint is associated with data representative of
the organism type and other identifying information.

9. The database of claim 8, wherein the datapoints are answers to
questions regarding one or more parameters selected from the group
consisting of ethnicity, age, gender, height, weight, alcohol intake, number of
pregnancies, number of live births, vegetarians, type of physical activity, state of
residence and/or length of residence in a particular state, educational level, age
of parent at death, cause of parent death, former or current smoker, length of
time as a smoker, frequency of smoking, occurrence of a disease in immediate
family (parent, siblings, children), use of prescription drugs and/or reason
therefor, length and/or number of hospital stays and exposure to environmental
factors.

10. The database of claim 9, wherein the organisms are mammals and 
the samples are body fluids or tissues. 

11. The database of claim 9, wherein the samples are selected from
blood, blood fractions, cells and subcellular organelles.

12. The database of claim 8, further comprising,

phenotypic data from an organism.

13. The database of claim 12, wherein the data includes one of physical 
characteristics, background data, medical data, and historical data. 

14. The database of claim 8, further comprising,

genotypic data from nucleic acid obtained from an organism.

15. The database of claim 14, wherein genotypic data includes,
genetic markers, non-coding regions, microsatellites, RFLPs, VNTRs, historical
data of the organism, medical history, and phenotypic information.

16. The database of claim 8 that is a relational database. 

17. The database of claim 16, wherein the data are related to an
indexer datapoint representative of each organism from whom data is obtained.

18. A method of identifying polymorphisms that are candidate genetic
markers, comprising:

identifying a polymorphism; and

identifying any pathway or gene linked to the locus of the
polymorphism, wherein

the polymorphisms are identified in samples associated with a target
population that comprises healthy subjects.

19. The method of claim 18, wherein the polymorphism is identified by
detecting the presence of target nucleic acids in a sample by a method,
comprising the steps of:

a) hybridizing a first oligonucleotide to the target nucleic acid;

b) hybridizing a second oligonucleotide to an adjacent region of the
target nucleic acid;

c) ligating the hybridized oligonucleotides; and

d) detecting hybridized first oligonucleotide by mass spectrometry as
an indication of the presence of the target nucleic acid.

20. The method of claim 18, wherein the polymorphism is identified by
detecting target nucleic acids in a sample by a method, comprising the steps of:

a) hybridizing a first oligonucleotide to the target nucleic acid and
hybridizing a second oligonucleotide to an adjacent region of the target nucleic
acid;

b) contacting the hybridized first and second oligonucleotides with a
cleavage enzyme to form a cleavage product; and

c) detecting the cleavage product by mass spectrometry as an
indication of the presence of the target nucleic acid.

21. The method of claim 20 wherein the samples are from subjects in
a healthy database.

22. The method of claim 18, wherein the polymorphism is identified
by identifying target nucleic acids in a sample by primer oligo base extension
(PROBE).

23. The method of claim 22, wherein primer oligo base extension
comprises:

a) obtaining a nucleic acid molecule that contains a target nucleotide;

b) optionally immobilizing the nucleic acid molecule onto a solid support,
to produce an immobilized nucleic acid molecule;

c) hybridizing the nucleic acid molecule with a primer oligonucleotide that
is complementary to the nucleic acid molecule at a site adjacent to the target
nucleotide;

d) contacting the product of step c) with a composition comprising a
dideoxynucleoside triphosphate or a 3'-deoxynucleoside triphosphate and a
polymerase, so that only a dideoxynucleoside or 3'-deoxynucleoside triphosphate
that is complementary to the target nucleotide is extended onto the primer; and

e) detecting the extended primer, thereby identifying the target
nucleotide.

24. The method of claim 23, wherein detection of the extended primer
is effected by mass spectrometry, comprising:

ionizing and volatizing the product of step d); and

detecting the extended primer by mass spectrometry, thereby identifying
the target nucleotide.

25. The method of claim 24, wherein:

samples are presented to the mass spectrometer as arrays on chips; and

each sample occupies a volume that is about the size of the laser spot
projected by the laser in a mass spectrometer used in matrix-assisted laser
desorption/ionization (MALDI) spectrometry.

26. A combination, comprising:

a database containing parameters associated with a datapoint
representative of a subject from whom samples are obtained, wherein the
subjects are healthy; and

an indexed collection of the samples, wherein the index identifies the
subject from whom the sample was obtained.

27. The combination of claim 26, wherein the parameter is selected
from the group consisting of ethnicity, age, gender, height, weight, alcohol
intake, number of pregnancies, number of live births, vegetarians, type of
physical activity, state of residence and/or length of residence in a particular
state, educational level, age of parent at death, cause of parent death, former or
current smoker, length of time as a smoker, frequency of smoking, occurrence
of disease in immediate family (parent, siblings, children), use of prescription
drugs and/or reason therefor, length and/or number of hospital stays and
exposure to environmental factors.

28. The combination of claim 26, wherein the database further 
contains genotypic data for each subject. 

29. The combination of claim 26, wherein the samples are blood. 
30. A data storage medium, comprising the database of claim 8.

31. A computer system, comprising the database of claim 8.

32. A system for high throughput processing of biological samples,
comprising:

a process line comprising a plurality of processing stations, each of which
performs a procedure on a biological sample contained in a
reaction vessel;

a robotic system that transports the reaction vessel from processing
station to processing station;

a data analysis system that receives test results of the process line and
automatically processes the test results to make a determination
regarding the biological sample in the reaction vessel;

a control system that determines when the test at each processing
station is complete and, in response, moves the reaction vessel to
the next test station, and continuously processes reaction vessels
one after another until the control system receives a stop
instruction; and

a database of claim 8, wherein the samples tested by the automated
process line comprise samples from subjects in the database.

33. The system of claim 32, wherein one of the processing stations 
comprises a mass spectrometer. 

34. The system of claim 32, wherein the data analysis system
processes the test results by receiving test data from the mass spectrometer
such that the test data for a biological sample contains one or more signals,
whereupon the data analysis system determines the area under the curve of
each signal and normalizes the results thereof and obtains a substantially
quantitative result representative of the relative amounts of components in the
tested sample.

35. A method for high throughput processing of biological samples,
the method comprising:

transporting a reaction vessel along a system of claim 32, comprising a
process line having a plurality of processing stations, each of
which performs a procedure on one or more biological samples
contained in the reaction vessel;

determining when the test procedure at each processing station is
complete and, in response, moving the reaction vessel to the next
processing station;

receiving test results of the process line and automatically processing the
test results to make a data analysis determination regarding the
biological samples in the reaction vessel; and

processing reaction vessels continuously one after another until receiving
a stop instruction, wherein the samples tested by the automated
process line comprise samples from subjects in the database.
36. The method of claim 35, wherein one of the processing stations
comprises a mass spectrometer.

37. The method of claim 36, wherein the samples are analyzed by a
method comprising primer oligo base extension (PROBE).

38. The method of claim 37, further comprising:

processing the test results by receiving test data from the mass
spectrometer such that the test data for a biological sample contains one or
more signals or numerical values representative of signals, whereupon the data
analysis system determines the area under the curve of each signal and
normalizes the results thereof and obtains a substantially quantitative result
representative of the relative amounts of components in the tested sample.
39. The method of claim 37, wherein primer oligo base extension
comprises:

a) obtaining a nucleic acid molecule that contains a target nucleotide;

b) optionally immobilizing the nucleic acid molecule onto a solid support,
to produce an immobilized nucleic acid molecule;

c) hybridizing the nucleic acid molecule with a primer oligonucleotide that
is complementary to the nucleic acid molecule at a site adjacent to the target
nucleotide;

d) contacting the product of step c) with a composition comprising a
dideoxynucleoside triphosphate or a 3'-deoxynucleoside triphosphate and a
polymerase, so that only a dideoxynucleoside or 3'-deoxynucleoside triphosphate
that is complementary to the target nucleotide is extended onto the primer; and

e) detecting the primer, thereby identifying the target nucleotide.

40. The method of claim 39, wherein detection of the extended primer is
effected by mass spectrometry, comprising:

ionizing and volatizing the product of step d); and

detecting the extended primer by mass spectrometry, thereby identifying
the target nucleotide.

41. The method of claim 36, wherein the target nucleic acids in the
sample are detected and/or identified by a method, comprising the steps of:

a) hybridizing a first oligonucleotide to the target nucleic acid;

b) hybridizing a second oligonucleotide to an adjacent region of the
target nucleic acid;

c) ligating the hybridized oligonucleotides; and

d) detecting hybridized first oligonucleotide by mass spectrometry as
an indication of the presence of the target nucleic acid.

42. The method of claim 36, wherein the target nucleic acids in the
sample are detected and/or identified by a method, comprising the steps of:

a) hybridizing a first oligonucleotide to the target nucleic acid and
hybridizing a second oligonucleotide to an adjacent region of the target nucleic
acid;

b) contacting the hybridized first and second oligonucleotides with a
cleavage enzyme to form a cleavage product; and

c) detecting the cleavage product by mass spectrometry as an
indication of the presence of the target nucleic acid.

43. A method of producing a database stored in a computer memory,
comprising:

identifying healthy members of a population;

obtaining identifying and historical information and data relating to the
identified members of the population;

entering the member-related data into the computer memory database for
each identified member of the population and associating the member and the
data with an indexer.

44. The method of claim 43, further comprising:

obtaining a body tissue or body fluid sample of an identified member;

analyzing the body tissue or body fluid in the sample; and

entering the results of the analysis for each member into the computer
memory database and associating each result with the indexer representative of
each member.

45. A database produced by the method of claim 43. 

46. A database produced by the method of claim 44. 

47. The database of claim 8, wherein:

the organisms are selected from among animals, bacteria, fungi,
protozoans and parasites; and

each datapoint is associated with parameters representative of the
organism type and identifying information.

48. The database of claim 43, further comprising,
phenotypic data regarding each subject.

49. The database of claim 47 that is a relational database and the
parameters are the answers to the questions in the questionnaire.

50. The database of claim 8, further comprising,

genotypic data of nucleic acid of the subject, wherein genotypic data
includes, but is not limited to, genetic markers, non-coding regions,
microsatellites, restriction fragment length polymorphisms (RFLPs), variable
number tandem repeats (VNTRs), historical data of the organism, the medical
history of the subject, phenotypic information, and other information.

51. A database, comprising data records stored in computer memory,
wherein the data records contain information that identifies healthy members of
a population, and also contain identifying and historical information and data
relating to the identified members.

52. The database of claim 51, further comprising an index value for 
each identified member that associates each member of the population with the 
identifying and historical information and data. 

53. A computer system, comprising the database of claim 51.

54. An automated process line, comprising the database of claim 51. 

55. A method for determining a polymorphism that correlates with
age, ethnicity or gender, comprising:

identifying a polymorphism; and

determining the frequency of the polymorphism with increasing age, with
ethnicity or with gender in a healthy population.

56. A method for determining whether a polymorphism correlates with
susceptibility to morbidity, early mortality, or morbidity and early mortality,
comprising:

identifying a polymorphism; and

determining the frequency of the polymorphism with increasing age in a
healthy population.

57. A high throughput method of determining frequencies of genetic
variations, comprising:

selecting a healthy target population and a genetic variation to be
assessed;

pooling a plurality of samples of biopolymers obtained from members of
the population;

determining or detecting the biopolymer that comprises the variation by
mass spectrometry;

obtaining a mass spectrum or a digital representation thereof; and

determining the frequency of the variation in the population.

58. The method of claim 57, wherein:

the variation is selected from the group consisting of an allelic variation, a
post-translational modification, a nucleic modification, a label, a mass
modification of a nucleic acid and methylation; and/or

the biopolymer is a nucleic acid, a protein, a polysaccharide, a lipid, a
small organic metabolite or intermediate, wherein the concentration of
biopolymer of interest is the same in each of the samples; and/or

the frequency is determined by assessing the method comprising
determining the area under the peak in the mass spectrum or digital
representation thereof corresponding to the mass of the biopolymer comprising
the genomic variation.

59. The method of claim 58, wherein the method for determining the
frequency is effected by determining the ratio of the signal or the digital
representation thereof to the total area of the entire mass spectrum, which is
corrected for background.

60. A method for discovery of a polymorphism in a population,
comprising:

sorting the database of claim 8 according to a selected parameter to
identify samples that match the selected parameter;

isolating a biopolymer from each identified sample;

optionally pooling each isolated biopolymer;

optionally amplifying the amount of biopolymer;

cleaving the pooled biopolymers to produce fragments thereof;

obtaining a mass spectrum of the resulting fragments and comparing the
mass spectrum with a control mass spectrum to identify differences between the
spectra and thereby identifying any polymorphisms; wherein:

the control mass spectrum is obtained from unsorted samples in the
collection or samples sorted according to a different parameter.

61. The method of claim 60, wherein cleaving is effected by contacting 
the biopolymer with an enzyme. 

62. The method of claim 61, wherein the enzyme is selected from the
group consisting of nucleotide glycosylase, a nickase and a type IIS restriction
enzyme.

63. The method of claim 60, wherein the biopolymer is a nucleic acid 
or a protein. 

64. The method of claim 60, wherein the mass spectrometric
format is selected from among Matrix-Assisted Laser Desorption/Ionization,
Time-of-Flight (MALDI-TOF), Electrospray (ES), IR-MALDI, Ion Cyclotron
Resonance (ICR), Fourier Transform and combinations thereof.

65. A method for discovery of a polymorphism in a population,
comprising:

obtaining samples of body tissue or fluid from a plurality of organisms;

isolating a biopolymer from each sample;

pooling each isolated biopolymer;

optionally amplifying the amount of biopolymer;

cleaving the pooled biopolymers to produce fragments thereof;

obtaining a mass spectrum of the resulting fragments;

comparing the frequency of each fragment to identify fragments present
in amounts lower than the average frequency, thereby identifying any
polymorphisms.

66. The method of claim 65, wherein cleaving is effected by contacting
the biopolymer with an enzyme.

67. The method of claim 66, wherein the enzyme is selected from the
group consisting of nucleotide glycosylase, a nickase and a type IIS restriction
enzyme.

68. The method of claim 65, wherein the biopolymer is a nucleic acid
or a protein.

69. The method of claim 65, wherein the mass spectrometric
format is selected from among Matrix-Assisted Laser Desorption/Ionization,
Time-of-Flight (MALDI-TOF), Electrospray (ES), IR-MALDI, Ion Cyclotron
Resonance (ICR), Fourier Transform and combinations thereof.

70. The method of claim 65, wherein the samples are obtained from
healthy subjects.

71. A method of correlating a polymorphism with a parameter,
comprising:

sorting the database of claim 8 according to a selected parameter to
identify samples that match the selected parameter;

isolating a biopolymer from each identified sample;

pooling each isolated biopolymer;

optionally amplifying the amount of biopolymer;

determining the frequency of the polymorphism in the pooled
biopolymers, wherein:

an alteration of the frequency of the polymorphism compared to a control
indicates a correlation of the polymorphism with the selected parameter; and

the control is the frequency of the polymorphism in pooled biopolymers
obtained from samples identified from an unsorted database or from a database
sorted according to a different parameter.

72. The method of claim 71, wherein the parameter is selected from the
group consisting of ethnicity, age, gender, height, weight, alcohol intake,
number of pregnancies, number of live births, vegetarians, type of physical
activity, state of residence and/or length of residence in a particular state,
educational level, age of parent at death, cause of parent death, former or
current smoker, length of time as a smoker, frequency of smoking, occurrence
of a disease in immediate family (parent, siblings, children), use of prescription
drugs and/or reason therefor, length and/or number of hospital stays and
exposure to environmental factors.

73. The method claim 72, wherein the parameter is occurrence of 
disease or a particular disease in an immediate family member, thereby 

5 correlating the polymorphism with the disease. 

74. The method of claim 71, wherein the pooled biopolymers are 
pooled nucleic acid molecules. 

75. The method of claim 74, wherein the polymorphism is detected 
by primer oligo base extension (PROBE). 

76. The method of claim 75, wherein primer oligo base extension
comprises:

a) optionally immobilizing the nucleic acid molecules onto a solid support,
to produce immobilized nucleic acid molecules;

b) hybridizing the nucleic acid molecules with a primer oligonucleotide
that is complementary to the nucleic acid molecule at a site adjacent to the
polymorphism;

c) contacting the product of step b) with a composition comprising a
dideoxynucleoside triphosphate or a 3'-deoxynucleoside triphosphate and a
polymerase, so that only a dideoxynucleoside or 3'-deoxynucleoside triphosphate
that is complementary to the polymorphism is extended onto the primer; and

d) detecting the extended primer, thereby detecting the polymorphism in
nucleic acid molecules in the pooled nucleic acids.
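
The extension step of claim 76 can be pictured with a short Python sketch, offered only as an illustration. The sequences are hypothetical, the function name is invented for this example, and the model (a primer written in the same sense as the strand being copied, extended until the first chain-terminating base is incorporated) is a simplifying assumption rather than a description of the claimed chemistry.

```python
def extend_primer(primer, strand, terminators):
    """Extend `primer` along `strand` (both written 5'->3', same sense)
    until a base listed in `terminators` has been incorporated."""
    start = strand.find(primer)
    if start < 0:
        raise ValueError("primer does not match the strand")
    product = primer
    for base in strand[start + len(primer):]:
        product += base
        if base in terminators:   # a dideoxy base ends the extension
            break
    return product

primer = "ACGTTGCA"                 # hypothetical primer ending next to the polymorphic site
allele_c = "TTACGTTGCACGTA"         # hypothetical strand with C at the polymorphic site
allele_g = "TTACGTTGCAGCTA"         # hypothetical strand with G at the polymorphic site

# Assume a termination mix in which only C is supplied as a dideoxynucleotide.
product_c = extend_primer(primer, allele_c, terminators={"C"})
product_g = extend_primer(primer, allele_g, terminators={"C"})
print(product_c, len(product_c))
print(product_g, len(product_g))
```

Under these assumptions the two alleles yield extension products of different length, and therefore of different mass, which is what allows them to be distinguished in step d).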

77. The method of claim 76, wherein detecting is effected by mass 
spectrometry. 

78. The method of claim 71, wherein the frequency is the percentage of
nucleic acid molecules in the pooled nucleic acids that contain the
polymorphism.

79. The method of claim 78, wherein the ratio is determined by 
obtaining mass spectra of the pooled nucleic acids. 
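
As a minimal sketch of how a percentage might be derived from a mass spectrum of pooled nucleic acids, the following Python fragment treats the integrated peak area of each allele-specific signal as proportional to the abundance of that allele. That proportionality, the allele labels and the peak areas are assumptions made for illustration.

```python
def allele_percentages(peak_areas):
    """Convert allele-specific peak areas into percentages of the pool."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {allele: 100.0 * area / total for allele, area in peak_areas.items()}

# Hypothetical integrated peak areas for the two extension products.
areas = {"C": 750.0, "G": 250.0}
percentages = allele_percentages(areas)
print(percentages)
```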

80. The method of claim 72, wherein the parameter is age, thereby
correlating the polymorphism with susceptibility to morbidity, early mortality or
morbidity and early mortality.

81. A method for haplotyping polymorphisms in a nucleic acid,
comprising:

(a) sorting the database of claim 8 according to a selected parameter
to identify samples that match the selected parameter;

(b) isolating nucleic acid from each identified sample;

(c) optionally pooling each isolated nucleic acid;

(d) amplifying the amount of nucleic acid;

(e) forming single-stranded nucleic acid and splitting each
single-strand into a separate reaction vessel;

(f) contacting each single-stranded nucleic acid with an adaptor
nucleic acid to form an adaptor complex;

(g) contacting the adaptor complex with a nuclease and a ligase;

(h) contacting the products of step (g) with a mixture that is capable
of amplifying a ligated adaptor to produce an extended product;

(i) obtaining a mass spectrum of each nucleic acid resulting from step
(h) and detecting a polymorphism by identifying a signal corresponding to the
extended product;

(j) repeating steps (f) through (i) utilizing an adaptor nucleic acid able
to hybridize with another adaptor nucleic acid that hybridizes to a different
sequence on the same strand; whereby

the polymorphisms are haplotyped by detecting more than one extended
product.

82. The method of claim 81, wherein the nuclease is Fen-1.

83. A method for haplotyping polymorphisms in a population, 
comprising:

sorting the database of claim 8 according to a selected parameter to
identify samples that match the selected parameter;

isolating a nucleic acid from each identified sample;
pooling each isolated nucleic acid;
optionally amplifying the amount of nucleic acid;

contacting the nucleic acid with at least one enzyme to produce
fragments thereof;

obtaining a mass spectrum of the resulting fragments; whereby:

the polymorphisms are detected by detecting signals corresponding to the
polymorphisms; and

the polymorphisms are haplotyped by determining from the mass
spectrum that the polymorphisms are located on the same strand of the nucleic
acid.

84. The method of claim 83, wherein the enzyme is a nickase. 

85. The method of claim 84, wherein the nickase is selected from the 
group consisting of NY2A and NYS1. 

86. A method for detecting methylated nucleotides within a nucleic
acid sample, comprising:

splitting a nucleic acid sample into separate reaction vessels;
contacting nucleic acid in one reaction vessel with bisulfite;
amplifying the nucleic acid in each reaction vessel;
cleaving the nucleic acids in each reaction vessel to produce fragments
thereof;

obtaining a mass spectrum of the resulting fragments from one reaction
vessel and another mass spectrum of the resulting fragments from another
reaction vessel; whereby:

cytosine methylation is detected by identifying a difference in signals
between the mass spectra.

87. The method of claim 86, wherein: 

the step of amplifying is carried out in the presence of uracil; and 
the step of cleaving is effected by a uracil glycosylase. 
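
The final comparison in claim 86 can be sketched as a search for signals that appear in one spectrum but not in the other. The Python fragment below is only an illustration: the peak masses are hypothetical, each spectrum is reduced to a list of peak positions, and the matching tolerance of 1.0 mass unit is an arbitrary assumption.

```python
def differing_peaks(spectrum_a, spectrum_b, tol=1.0):
    """Return (only_in_a, only_in_b): peak masses that have no counterpart
    within `tol` mass units in the other spectrum."""
    only_a = [m for m in spectrum_a if all(abs(m - n) > tol for n in spectrum_b)]
    only_b = [n for n in spectrum_b if all(abs(n - m) > tol for m in spectrum_a)]
    return only_a, only_b

# Hypothetical fragment masses from the two reaction vessels.
bisulfite_treated = [2105.4, 3320.1, 4512.8]
untreated = [2105.6, 3304.1, 4512.7]

only_treated, only_untreated = differing_peaks(bisulfite_treated, untreated)
print(only_treated, only_untreated)
```

In this toy example one fragment shifts in mass between the two vessels while the other two are unchanged; in the claimed method such a difference in signals is what indicates cytosine methylation.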

88. A method for identifying a biological sample, comprising:

generating a data set indicative of the composition of the biological
sample;

denoising the data set to generate denoised data;

deleting the baseline from the denoised data to generate an intermediate
data set;

defining putative peaks for the biological sample;
using the putative peaks to generate a residual baseline;
removing the residual baseline from the intermediate data set to generate
a corrected data set;

locating, responsive to removing the residual baseline, a probable peak in
the corrected data set; and

identifying, using the located probable peak, the biological sample;
wherein the generated biological sample data set comprises data from
sense strands and antisense strands of assay fragments.
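
The sequence of signal-processing steps in claim 88 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: a moving average stands in for the denoising step, a rolling minimum for the baseline, and the median of the off-peak points for the residual baseline. None of these choices, nor the function names, window widths or threshold, is taken from the claim, and the handling of sense and antisense strand data is not modeled.

```python
from statistics import median

def moving_average(data, width=3):
    """Assumed denoising step: centered moving average."""
    half = width // 2
    return [sum(data[max(0, i - half): i + half + 1]) /
            len(data[max(0, i - half): i + half + 1]) for i in range(len(data))]

def rolling_minimum(data, width=21):
    """Assumed baseline estimate: minimum over a wide centered window."""
    half = width // 2
    return [min(data[max(0, i - half): i + half + 1]) for i in range(len(data))]

def local_maxima(data, threshold):
    """Indices of local maxima whose height exceeds `threshold`."""
    return [i for i in range(1, len(data) - 1)
            if data[i] > threshold and data[i] >= data[i - 1] and data[i] > data[i + 1]]

def correct_and_locate(raw, threshold=5.0, guard=3):
    denoised = moving_average(raw)                        # denoise
    baseline = rolling_minimum(denoised)                  # estimate baseline
    intermediate = [d - b for d, b in zip(denoised, baseline)]
    putative = local_maxima(intermediate, threshold)      # putative peaks
    masked = {j for p in putative for j in range(p - guard, p + guard + 1)}
    off_peak = [v for i, v in enumerate(intermediate) if i not in masked]
    residual = median(off_peak) if off_peak else 0.0      # residual baseline
    corrected = [v - residual for v in intermediate]      # corrected data set
    return corrected, local_maxima(corrected, threshold)  # probable peaks

# Synthetic spectrum: sloping baseline with one peak centered at index 30.
raw = [10 + 0.1 * i for i in range(60)]
raw[29] += 20
raw[30] += 40
raw[31] += 20

corrected, probable = correct_and_locate(raw)
print(probable)
```

The putative peaks are used only to decide which points are excluded when the residual baseline is estimated, so that the peaks themselves do not bias the correction.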

89. The method according to claim 88, wherein identifying includes
combining data from the sense strands and the antisense strands, and comparing the data
against expected sense strand and antisense strand values, to identify the
biological sample.

90. The method according to claim 88, wherein identifying includes 
deriving a peak probability for the probable peak, in accordance with whether the 
probable peak is from sense strand data or from antisense strand data. 

91. The method according to claim 88, wherein identifying includes
deriving a peak probability for the probable peak and applying an allelic penalty in
response to a ratio between a calculated area under the probable peak and a calculated
expected average area under all peaks in the data set.

92. A method for identifying a biological sample, comprising:

generating a data set indicative of the composition of the biological
sample;

denoising the data set to generate denoised data;

deleting the baseline from the denoised data to generate an intermediate
data set;

defining putative peaks for the biological sample;
using the putative peaks to generate a residual baseline;

removing the residual baseline from the intermediate data set to generate
a corrected data set;

locating, responsive to removing the residual baseline, a probable peak in
the corrected data set; and

identifying, using the located probable peak, the biological sample;
wherein identifying includes deriving a peak probability for the probable
peak and applying an allelic penalty in response to a ratio between a calculated
area under the probable peak and a calculated expected average area under all
peaks in the data set.

93. The method according to claim 92, wherein identifying includes
comparing data from probable peaks that did not receive an applied allelic penalty to
determine their mass in accordance with oligonucleotide biological data.

94. The method according to claim 92, wherein the allelic penalty is
not applied to probable peaks whose ratio of area under the peak to the
expected area value is greater than 30%.
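
The 30% cutoff of claim 94 can be illustrated with a short Python sketch. Only the cutoff comes from the claim; the form of the penalty (a linear scaling of the peak probability by the ratio divided by the cutoff), the function name and the numerical values are assumptions introduced for the example.

```python
def apply_allelic_penalty(peak_probability, peak_area, expected_area, cutoff=0.30):
    """Return the peak probability after an (assumed) allelic penalty.

    No penalty is applied when peak_area / expected_area exceeds `cutoff`.
    """
    if expected_area <= 0:
        raise ValueError("expected_area must be positive")
    ratio = peak_area / expected_area
    if ratio > cutoff:
        return peak_probability
    # Assumed penalty form: scale the probability down in proportion to the ratio.
    return peak_probability * (ratio / cutoff)

strong = apply_allelic_penalty(0.9, peak_area=50.0, expected_area=100.0)
weak = apply_allelic_penalty(0.9, peak_area=15.0, expected_area=100.0)
print(strong, weak)
```

A peak whose area is half the expected average keeps its probability unchanged, whereas a peak at 15% of the expected average is penalized.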

95. A method for detecting a polymorphism in a nucleic acid, 
comprising: 

amplifying a region of the nucleic acid to produce an amplicon, wherein 
the resulting amplicon comprises one or more enzyme restriction sites;

contacting the amplicon with a restriction enzyme to produce fragments;
obtaining a mass spectrum of the resulting fragments and analyzing
signals in the mass spectrum by the method of claim 88; whereby:
the polymorphism is detected from the pattern of the signals.

96. A subcollection of samples from a target population, comprising:

a plurality of samples, wherein the samples are selected from the group
consisting of nucleic acids, fetal tissue, protein samples; and
a symbology on the containers containing the samples, wherein the
symbology is representative of the source and/or history of each sample,
wherein:

the target population is a healthy population that has not been selected
for any disease state;

the collection comprises samples from the healthy population; and
the subcollection is obtained by sorting the collection according to
specified parameters.
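
Sorting a collection according to specified parameters, as recited in claim 96, can be pictured with a small Python sketch. The record layout, the field names and the barcode values are hypothetical; they merely stand in for the symbology and donor data associated with each sample.

```python
def subcollection(collection, **criteria):
    """Return the records whose fields equal every specified parameter."""
    return [record for record in collection
            if all(record.get(field) == value for field, value in criteria.items())]

# Hypothetical records keyed by the barcode affixed to each sample container.
collection = [
    {"barcode": "S001", "sex": "F", "smoker": False, "state": "CA"},
    {"barcode": "S002", "sex": "M", "smoker": True, "state": "CA"},
    {"barcode": "S003", "sex": "F", "smoker": True, "state": "MA"},
    {"barcode": "S004", "sex": "F", "smoker": False, "state": "MA"},
]

female_nonsmokers = subcollection(collection, sex="F", smoker=False)
print([record["barcode"] for record in female_nonsmokers])
```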

97. The combination of claim 26, wherein the samples are selected
from the group consisting of nucleic acids, fetal tissue, protein, tissue,
body fluid, cell, seed, microbe, pathogen and reproductive tissue samples.

98. A combination, comprising the database of claim 8 and a mass 
spectrometer. 

99. The combination of claim 98 that is an automated process line for
analyzing biological samples.

100. A system for high throughput processing of biological samples,
comprising:

a database of claim 8, wherein the samples tested by the automated
process line comprise samples from subjects in the database; and
a mass spectrometer for analysis of biopolymers in the samples.
Questionnaire for Population-Based Sample Banking

Data Collection Form

Collection Information

Consent Form Signed: □ Yes □ No

Date of Collection (MM/DD/YY): _/_/98

Time of Sample Collection (nearest hour in 24 hour clock format):

Initials of Data Collector:

Collecting Agency:

Affix Barcode Here

(DO NOT COMPLETE; for data entry only) Sample: □ intact □ lost □ broken

Donor Information

Sex: □ Male □ Female

Date of Birth (MM/YY): _/_

In which state do you live?

How long have you lived there? (Years)

What is your highest grade you completed in school?

□ less than 8th grade
□ 8th, 9th, 10th or 11th grade
□ high school graduate or equivalency
□ some college, 2 yr degree
□ college graduate, 4 yr degree
□ post graduate education or degree

To the best of your knowledge, what is the Ethnic Origin of your Father and Mother?

Father  Mother
□  □  Caucasian (please check specific geographic area below if known)
□  □  Northern Europe (Austria, Denmark, Finland, France, Germany, Netherlands, Norway, Sweden, Switzerland, U.K.)
□  □  Southern Europe (Greece, Italy, Spain)
□  □  Eastern Europe (Czechoslovakia, Hungary, Poland, Russia, Yugoslavia)
□  □  Middle Eastern (Israel, Egypt, Iran, Iraq, Jordan, Syria, other Arab States)
□  □  African-American
□  □  Hispanic (please check specific geographic area below if known)
□  □  Mexico
□  □  Central America, South America
□  □  Cuba, Puerto Rico, other Caribbean
□  □  Asian (please check specific geographic area below if known)
□  □  Japanese
□  □  Chinese
□  □  Korean
□  □  Vietnamese
□  □  other Asian
□  □  Other
□  □  Don't know

Health information: Have you or has anyone in your immediate family (parents, brothers, sisters, or your children)
had the following? Check all that apply.

Disease (one box each for You, Mother, Father, Sister, Brother, Child):

Heart Disease, Stroke or Arteriosclerosis
Cancer (Specify type if known)
Alzheimer's Disease or Dementia
Chronic Inflammatory or Autoimmune Disease
Nervous System Disease like Multiple Sclerosis
Other (please specify)

Additional health information details you would like to provide:
Collection Information

Consent Form Signed: □ Yes □ No

Date of Collection: Month (JAN, FEB, MAR, APR, MAY, JUN, JUL, AUG, SEP, OCT, NOV, DEC), Day, Year

Time of Sample Collection (nearest hour, in 24 hour clock format):

Initials:

Initials of Data Collector:

(DO NOT COMPLETE; for data entry only)
Sample: □ Intact □ Lost □ Broken
Volume (ml):

Donor Information

Date of Birth: Month (JAN, FEB, MAR, APR, MAY, JUN, JUL, AUG, SEP, OCT, NOV, DEC), Year

Sex: □ Male □ Female

Height: Ft, Inches

Weight (lb):

What physical activity do you do on a regular basis?
□ Running □ Swimming □ Biking □ Gymnastics □ Other □ None

Are you a vegetarian? □ Yes □ No

If Female:
How many times have you been pregnant?
How many times did you give birth?

In which state do you live?

How long have you lived there?

To the best of your knowledge, what is the Ethnic Origin of your Father and Mother?

Father  Mother
□  □  Caucasian (please mark specific geographic area below if known)
□  □  Northern Europe (Austria, Denmark, Finland, France, Germany, Netherlands, Norway, Sweden, Switzerland, UK)
□  □  Southern Europe (Greece, Italy, Spain, Turkey)
□  □  Eastern Europe (Czechoslovakia, Hungary, Poland, Russia, Yugoslavia)
□  □  Middle Eastern (Israel, Egypt, Iran, Iraq, Jordan, Syria, Other Arab States)
□  □  African-American
□  □  Hispanic (please mark specific geographic area below if known)
□  □  Mexico
□  □  Central America, South America
□  □  Cuba, Puerto Rico, other Caribbean
□  □  Asian (please mark specific geographic area below if known)
□  □  Japanese
□  □  Chinese
□  □  Korean
□  □  Vietnamese
□  □  Filipino
□  □  Native American
□  □  Other
□  □  Don't know

What is your highest grade you completed in school?
□ less than 8th grade
□ 8th, 9th, 10th, or 11th grade
□ high school graduate or equivalency
□ some college, 2 yr degree
□ college graduate, 4 yr degree
□ post graduate education or degree

Mother Deceased? □ Yes □ No
If Yes, at what age? □ < 29 □ 30-39 □ 40-49 □ 50-59 □ 60-69 □ 70-79 □ 80-89 □ > 90
Cause of Death Mother: □ Heart Disease □ Cancer □ Stroke □ Accident □ Suicide □ Other

Father Deceased? □ Yes □ No
If Yes, at what age? □ < 29 □ 30-39 □ 40-49 □ 50-59 □ 60-69 □ 70-79 □ 80-89 □ > 90
Cause of Death Father: □ Heart Disease □ Cancer □ Stroke □ Accident □ Suicide □ Other

FIG. 22A

WO 01/27857
PCT/US00/28413

Have you ever smoked? □ Yes □ No
If yes, for how long? (Years)

Have you been hospitalized in the past 5 years for more than 6 days at a time? □ Yes □ No
If yes, how many times?

For each hospitalization (if not the same), how long did you stay and for what reason?

1) Weeks:
□ Acute disorder, including infection and thrombosis □ Chronic Disorder □ Accident □ Other

2) Weeks:
□ Acute disorder, including infection and thrombosis □ Chronic Disorder □ Accident □ Other

3) Weeks:
□ Acute disorder, including infection and thrombosis □ Chronic Disorder □ Accident □ Other

Have you or has anyone in your immediate family (parents, brothers, sisters, or your children) had the following?
Mark all that apply!

You  Mother  Father  Sister  Brother  Child  Disease
□  □  □  □  □  □  Heart Disease, including arteriosclerosis
□  □  □  □  □  □  Stroke
□  □  □  □  □  □  Hypertension
□  □  □  □  □  □  Blood clots
□  □  □  □  □  □  Diabetes, insulin dependent
□  □  □  □  □  □  Diabetes, not insulin-dependent (diet controlled)
Cancer:
□  □  □  □  □  □  Lung & Bronchus
□  □  □  □  □  □  Breasts
□  □  □  □  □  □  Prostate
□  □  □  □  □  □  Colon & Rectum
□  □  □  □  □  □  Skin
□  □  □  □  □  □  Lymphoma & Leukemia
□  □  □  □  □  □  Other, please specify below:
□  □  □  □  □  □  Alzheimer's Disease
□  □  □  □  □  □  Epilepsy
□  □  □  □  □  □  Schizophrenia
□  □  □  □  □  □  Bipolar disorder (manic depression)
□  □  □  □  □  □  Major depression
□  □  □  □  □  □  Chronic Inflammatory or Autoimmune Disease including Multiple Sclerosis and Rheumatoid Arthritis
□  □  □  □  □  □  Emphysema
□  □  □  □  □  □  Asthma
□  □  □  □  □  □  Other, please specify below:

Do you take prescription drugs on a regular basis? □ Yes □ No
If yes, please specify below:

Have you ever donated blood before? □ Yes □ No
If yes, how many times: Number of Times

Additional health information details you would like to provide:

Do you drink any kind of alcoholic beverage?
□ Never □ Hardly ever □ Less than 3 times per week □ 3 or more times per week □ Daily

FOR OFFICE USE ONLY

FIG. 22B

Collection Information

Consent Form Signed: □ Yes □ No

Date of Collection: Year, Month, Day

Time of Sample Collection (nearest hour, in 24 hour clock format):

Initials:

Initials of Data Collector:

(DO NOT COMPLETE; for data entry only)
Sample: □ Intact □ Lost □ Broken
Volume (ml):

Donor Information

Sex: □ Male □ Female

Height: Ft, Inches

Weight (lb):

What physical activity do you do on a regular basis?
□ Running □ Swimming □ Biking □ Gymnastics □ Other □ None

Are you a vegetarian? □ Yes □ No

If female:
How many times have you been pregnant?
How many times did you give birth?

In which state do you live?

How long have you lived there? (Years)

To the best of your knowledge, what is the Ethnic Origin of your Father and Mother?

Father  Mother
□  □  Caucasian (please mark specific geographic area below if known)
□  □  Northern Europe (Austria, Denmark, Finland, France, Germany, Netherlands, Norway, Sweden, Switzerland, UK)
□  □  Southern Europe (Greece, Italy, Spain, Turkey)
□  □  Eastern Europe (Czechoslovakia, Hungary, Poland, Russia, Yugoslavia)
□  □  Middle Eastern (Israel, Egypt, Iran, Iraq, Jordan, Syria, Other Arab States)
□  □  Other
□  □  Don't know

How many years have you been smoking? (Years)

Did you quit smoking? □ Yes □ No
If yes, how many years ago? (Years)

How many cigarettes do/did you smoke per day?

Do you have lung Emphysema? □ Yes □ No
If yes, for how long? (Years)

Continue on back

FIG. 22C

What is your highest grade you completed in school?
□ less than 8th grade
□ 8th, 9th, 10th, or 11th grade
□ high school graduate or equivalency
□ some college, 2 yr degree
□ college graduate, 4 yr degree
□ post graduate education or degree

Mother Deceased? □ Yes □ No
If Yes, at what age? □ < 29 □ 30-39 □ 40-49 □ 50-59 □ 60-69 □ 70-79 □ 80-89 □ > 90
Cause of Death Mother: □ Heart Disease □ Cancer □ Stroke □ Accident □ Suicide □ Other

Father Deceased? □ Yes □ No
If Yes, at what age? □ < 29 □ 30-39 □ 40-49 □ 50-59 □ 60-69 □ 70-79 □ 80-89 □ > 90
Cause of Death Father: □ Heart Disease □ Cancer □ Stroke □ Accident □ Suicide □ Other

Health Information

Have you or has anyone in your immediate family (parents, brothers, sisters, or your children) had the following?
Mark all that apply!

You  Mother  Father  Sister  Brother  Child  Disease
□  □  □  □  □  □  Heart Disease
□  □  □  □  □  □  Stroke
□  □  □  □  □  □  Blood clots
□  □  □  □  □  □  Diabetes, insulin dependent
□  □  □  □  □  □  Diabetes, not insulin-dependent
□  □  □  □  □  □  Lung & Bronchus
□  □  □  □  □  □  Breasts
□  □  □  □  □  □  Prostate
□  □  □  □  □  □  Colon & Rectum
□  □  □  □  □  □  Skin
□  □  □  □  □  □  Lymphoma & Leukemia
□  □  □  □  □  □  Other, please specify below:
□  □  □  □  □  □  Alzheimer's Disease
□  □  □  □  □  □  Schizophrenia
□  □  □  □  □  □  Bipolar disorder (manic depression)
□  □  □  □  □  □  Major depression
□  □  □  □  □  □  Chronic Inflammatory or Autoimmune Disease including Multiple Sclerosis and Rheumatoid Arthritis
□  □  □  □  □  □  Asthma
□  □  □  □  □  □  Other, please specify below:

Do you take prescription drugs on a regular basis? □ Yes □ No
If yes, please specify below:

Have you ever donated blood before? □ Yes □ No
If yes, how many times: Number of Times

Have you been hospitalized in the past 5 years for more than 6 days at a time? □ Yes □ No
If yes, how many times?

For each hospitalization (if not the same), how long did you stay and for what reason?

1) Weeks:
□ Acute disorder, including infection and thrombosis □ Chronic Disorder □ Accident □ Other

2) Weeks:
□ Acute disorder, including infection and thrombosis □ Chronic Disorder □ Accident □ Other

3) Weeks:
□ Acute disorder, including infection and thrombosis □ Chronic Disorder □ Accident □ Other

Do you drink any kind of alcoholic beverage?
□ Never □ Hardly ever □ Less than 3 times per week □ 3 or more times per week □ Daily

Additional health information details you would like to provide:

SEQUENCE LISTING

<110> SEQUENOM
Braun et al.

<120> METHODS FOR GENERATING DATABASES AND DATABASES FOR IDENTIFYING
POLYMORPHIC GENETIC MARKERS



<130> 24736-2033PC 

<140> Not Yet Assigned 
<141> 2000-10-13 

<150> 60/217,658 
<151> 2000-07-10 

<150> 60/159,176 
<151> 1999-10-13 

<150> 60/217,251 
<151> 2000-07-10 

<150> 09/663,968
<151> 2000-09-19 

<160> 118 

<170> FastSEQ for Windows Version 4.0 

<210> 1 

<211> 361 

<212> DNA 

<213> Homo Sapien 



<400> 1 

ctgaggacct ggtcctctga ctgctctttt cacccatcta cagtccccct tgccgtccca 60
agcaatggat gatttgatgc tgtccccgga cgatattgaa caatggttca ctgaagaccc 120
aggtccagat gaagctccca gaatgccaga ggctgctccc cgcgtggccc ctgcaccagc 180
agctcctaca ccggcggccc ctgcaccagc cccctcctgg cccctgtcat cttctgtccc 240
ttcccagaaa acctaccagg gcagctacgg tttccgtctg ggcttcttgc attctgggac 300
agccaagtct gtgacttgca cggtcagttg ccctgagggg ctggcttcca tgagacttca 360
a 361

<210> 2 
<211> 44 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide Primer 
<400> 2 

cccagtcacg acgttgtaaa acgctgagga cctggtcctc tgac 44 

<210> 3 
<211> 42 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide Primer 
<400> 3 

agcggataac aatttcacac aggttgaagt ctcatggaag cc 42 

<210> 4 
<211> 17 
<212> DNA 
<213> Artificial Sequence
<220> 

<223> Probe 
<400> 4 

gccagaggct gctcccc 17 

<210> 5 
<211> 17 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Probe



<400> 5 

gccagaggct gctcccc 17 

<210> 6 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Probe 
<400> 6 

gccagaggct gctccccgc 19 

<210> 7 
<211> 18 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Probe 



<400> 7 

gccagaggct gctccccc 18 

<210> 8 

<211> 161 

<212> DNA 

<213> Homo Sapien 



<400> 8 

gtccgtcaga acccatgcgg cagcaaggcc tgccgccgcc tcttcggccc agtggacagc 60 

gagcagctga gccgcgactg tgatgcgcta atggcgggct gcatccagga ggcccgtgag 120 

cgatggaact tcgactttgt caccgagaca ccactggagg g 161 

<210> 9 
<211> 43 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide Primer 
<400> 9 
cccagtcacg acgttgtaaa acggtccgtc agaacccatg cgg 43

<210> 10
<211> 44
<212> DNA

<213> Artificial Sequence
<220>

<223> Oligonucleotide Primer
<400> 10

agcggataac aatttcacac aggctccagt ggtgtctcgg tgac 44

<210> 11
<211> 15
<212> DNA

<213> Artificial Sequence
<220>

<223> Oligonucleotide Primer
<400> 11

cagcgagcag ctgag 15

<210> 12
<211> 15
<212> DNA

<213> Artificial Sequence
<220>

<223> Probe
<400> 12

cagcgagcag ctgag 15

<210> 13
<211> 16
<212> DNA

<213> Artificial Sequence
<220>

<223> Probe
<400> 13

cagcgagcag ctgagc 16

<210> 14
<211> 17
<212> DNA

<213> Artificial Sequence
<220>

<223> Probe
<400> 14

cagcgagcag ctgagac 17

<210> 15 
<211> 205 
<212> DNA 
<213> Homo Sapien

<400> 15

gcgctccatt catctcttca tcgactctct gttgaatgaa gaaaatccaa gtaaggccta 60
caggtgcagt tccaaggaag cctttgagaa agggctctgc ttgagttgta gaaagaaccg 120
ctgcaacaat ctgggctatg agatcaataa agtcagagcc aaaagaagca gcaaaatgta 180
cctgaagact cgttctcaga tgccc 205

<210> 16
<211> 42
<212> DNA

<213> Artificial Sequence
<220>

<223> Oligonucleotide Primers
<400> 16

cccagtcacg acgttgtaaa acggcgctcc attcatctct tc 42

<210> 17
<211> 42
<212> DNA

<213> Artificial Sequence
<220>

<223> Oligonucleotide Primer
<400> 17

agcggataac aatttcacac agggggcatc tgagaacgag tc 42



<210> 18 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide Primer 
<400> 18 

caatctgggc tatgagatca 20 

<210> 19 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Probe 
<400> 19 

caatctgggc tatgagatca 20 

<210> 20 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Probe 
<400> 20

caatctgggc tatgagatca a 21 

<210> 21 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Probe
<400> 21

caatctgggc tatgagatca gt 22

<210> 22 

<211> 60 

<212> DNA 

<213> Homo Sapien 

<220> 

<223> Probe 
<400> 22 

gtgccggcta ctcggatggc agcaaggact cctgcaaggg ggacagtgga ggcccacatg 60 

<210> 23 

<211> 60 

<212> DNA 

<213> Homo sapien 

<400> 23 

ccacccacta ccggggcacg tggtacctga cgggcatcgt cagctggggc cagggctgcg 60 

<210> 24 
<211> 42 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 24 

cccagtcacg acgttgtaaa acgatggcag caaggactcc tg 42 

<210> 25 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 25 

cacatgccac ccactacc 18 

<210> 26 
<211> 43 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Oligonucleotide primer
<400> 26

agcggataac aatttcacac aggtgacgat gcccgtcagg tac 43

<210> 27 
<211> 15 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Probe



<400> 27 

atgccaccca ctacc 15 

<210> 28 

<211> 19 

<212> DNA

<213> Artificial Sequence 
<220> 

<223> Probe 

<400> 28 

cacatgccac ccactaccg 19 

<210> 29 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Probe 
<400> 29 

cacatgccac ccactaccag 20 

<210> 30 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Probe 



<400> 30 

agcggataac aatttcacac agg 23 

<210> 31 

<211> 2363 

<212> DNA 

<213> Homo Sapien 

<220> 
<221> CDS 

<222> (138) . . . (2126) 
<223> AKAP-10 
<300>

<308> GenBank AF037439
<309> 1997-12-21

<400> 31

gcggcttgtt gataatatgg cggctggagc tgcctgggca tcccgaggag gcggtggggc 60
ccactcccgg aagaagggtc ccttttcgcg ctagtgcagc ggcccctctg gacccggaag 120
tccgggccgg ttgctga atg agg gga gcc ggg ccc tcc ccg cgc cag tcc 170

Met Arg Gly Ala Gly Pro Ser Pro Arg Gln Ser
1 5 10

ccc cgc acc ctc cgt ccc gac ccg ggc ccc gcc atg tcc ttc ttc cgg 218
Pro Arg Thr Leu Arg Pro Asp Pro Gly Pro Ala Met Ser Phe Phe Arg
15 20 25

cgg aaa gtg aaa ggc aaa gaa caa gag aag acc tca gat gtg aag tcc 266
Arg Lys Val Lys Gly Lys Glu Gln Glu Lys Thr Ser Asp Val Lys Ser
30 35 40

at: aaa gct tca ata tcc gta cat tcc cca caa aaa agc act aaa aat 314
Ile Lys Ala Ser Ile Ser Val His Ser Pro Gln Lys Ser Thr Lys Asn
45 50 55

cat gcc ttg ctg gag gct gca gga cca agt cat gtt gca atc aat gcc 362
His Ala Leu Leu Glu Ala Ala Gly Pro Ser His Val Ala Ile Asn Ala
60 65 70 75

att tct gcc aac atg gac tcc ttt tca agt agc agg aca gcc aca ctt 410
Ile Ser Ala Asn Met Asp Ser Phe Ser Ser Ser Arg Thr Ala Thr Leu
80 85 90

aag aag cag cca agc cac atg gag gct gct cat ttt ggt gac ctg ggc 458

Lys Lys Gln Pro Ser His Met Glu Ala Ala His Phe Gly Asp Leu Gly
95 100 105

aga tct tgt ctg gac tac cag act caa gag acc aaa tca agc ctt tct 506
Arg Ser Cys Leu Asp Tyr Gln Thr Gln Glu Thr Lys Ser Ser Leu Ser
110 115 120

aag acc ctt gaa caa gtc ttg cac gac act att gtc ctc cct tac ttc 554
Lys Thr Leu Glu Gln Val Leu His Asp Thr Ile Val Leu Pro Tyr Phe
125 130 135

att caa ttc atg gaa ctt cgg cga atg gag cat ttg gtg aaa ttt tgg 602
Ile Gln Phe Met Glu Leu Arg Arg Met Glu His Leu Val Lys Phe Trp
140 145 150 155

tta gag gct gaa agt ttt cat tca aca act tgg tcg cga ata aga gca 650
Leu Glu Ala Glu Ser Phe His Ser Thr Thr Trp Ser Arg Ile Arg Ala
160 165 170

cac agt cta aac aca atg aag cag agc tca ctg gct gag cct gtc tct 698
His Ser Leu Asn Thr Met Lys Gln Ser Ser Leu Ala Glu Pro Val Ser
175 180 185

cca tct aaa aag cat gaa act aca gcg tct ttt tta act gat tct ctt 746
Pro Ser Lys Lys His Glu Thr Thr Ala Ser Phe Leu Thr Asp Ser Leu
190 195 200

gat aag aga ttg gag gat tct ggc tca gca cag ttg ttt atg act cat 794
Asp Lys Arg Leu Glu Asp Ser Gly Ser Ala Gln Leu Phe Met Thr His
205 210 215

tca gaa gga att gac ctg aat aat aga act aac agc act cag aat cac 842
Ser Glu Gly Ile Asp Leu Asn Asn Arg Thr Asn Ser Thr Gln Asn His
220 225 230 235

ttg ctg ctt tcc cag gaa tgt gac agt gcc cat tct ctc cgt ctt gaa 890
Leu Leu Leu Ser Gln Glu Cys Asp Ser Ala His Ser Leu Arg Leu Glu
240 245 250

atg gcc aga gca gga act cac caa gtt tcc atg gaa acc caa gaa tct 938

Met Ala Arg Ala Gly Thr His Gln Val Ser Met Glu Thr Gln Glu Ser
255 260 265

tcc tct aca ctt aca gta gcc agt aga aat agt ccc gct tct cca cta 986
Ser Ser Thr Leu Thr Val Ala Ser Arg Asn Ser Pro Ala Ser Pro Leu
270 275 280

aaa gaa ttg tca gga aaa cta atg aaa agt ata gaa caa gat gca gtg 1034
Lys Glu Leu Ser Gly Lys Leu Met Lys Ser Ile Glu Gln Asp Ala Val
285 290 295

aat act ttt acc aaa tat ata tct cca gat gct gct aaa cca ata cca 1082
Asn Thr Phe Thr Lys Tyr Ile Ser Pro Asp Ala Ala Lys Pro Ile Pro
300 305 310 315

att aca gaa gca atg aga aat gac atc ata gca agg att tgt gga gaa 1130
Ile Thr Glu Ala Met Arg Asn Asp Ile Ile Ala Arg Ile Cys Gly Glu
320 325 330

gat gga cag gtg gat ccc aac tgt ttc gtt ttg gca cag tcc ata gtc 1178
Asp Gly Gln Val Asp Pro Asn Cys Phe Val Leu Ala Gln Ser Ile Val
335 340 345

ttt agt gca atg gag caa gag cac ttt agt gag ttt ctg cga agt cac 1226
Phe Ser Ala Met Glu Gln Glu His Phe Ser Glu Phe Leu Arg Ser His
350 355 360

cat ttc tgt aaa tac cag att gaa gtg ctg acc agt gga act gtt tac 1274
His Phe Cys Lys Tyr Gln Ile Glu Val Leu Thr Ser Gly Thr Val Tyr
365 370 375

ctg gct gac att ctc ttc tgt gag tca gcc ctc ttt tat ttc tct gag 1322
Leu Ala Asp Ile Leu Phe Cys Glu Ser Ala Leu Phe Tyr Phe Ser Glu
380 385 390 395

tac atg gaa aaa gag gat gca gtg aat atc tta caa ttc tgg ttg gca 1370
Tyr Met Glu Lys Glu Asp Ala Val Asn Ile Leu Gln Phe Trp Leu Ala
400 405 410

gca gat aac ttc cag tct cag ctt gct gcc aaa aag ggg caa tat gat 1418
Ala Asp Asn Phe Gln Ser Gln Leu Ala Ala Lys Lys Gly Gln Tyr Asp
415 420 425

gga cag gag gca cag aat gat gcc atg att tta tat gac aag tac ttc 1466
Gly Gin Glu Ala Gin Asn Asp Ala Met lie Leu Tyr Asp Lys Tyr Phe 
430 435 440 

tec etc caa gee aca cat cct ctt gga ttt gat gat gtt gta cga tta 1514 
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Ser Leu Gin Ala Thr His Pro Leu Gly Phe Asp Asp Val Val Arg Leu 
445 450 455 

gaa act gaa tec aat ate tgc agg gaa ggt ggg cca etc ccc aac tgt 1562 
Glu lie Glu Ser Asn lie Cys Arg Glu Gly Gly Pro Leu Pro Asn Cys 
460 465 470 475 

ttc aca act cca tta cgt cag gec tgg aca acc atg gag aag gtc ttt 1610 
Phe Thr Thr Pro Leu Arg Gin Ala Trp Thr Thr Met Glu Lys Val Phe 
480 485 490 

ttg cct ggc ttt ctg tec age aat ctt tat tat aaa tat ttg aat gat 1658 
Leu Pro Gly Phe Leu Ser Ser Asn Leu Tyr Tyr Lys Tyr Leu Asn Asp 
495 500 505 

etc ate cat teg gtt cga gga gat gaa ttt ctg ggc ggg aac gtg teg 1706 
Leu He Hie Ser Val Arg Gly Asp Glu Phe Leu Gly Gly Asn Val Ser 
510 515 520 

ccg act get cct ggc tct gtt ggc cct cct gat gag tct cac cca ggg 1754 
Pro Thr Ala Pro Gly Ser Val Gly Pro Pro Asp Glu Ser His Pro Gly 
525 530 535 

agt tct gac age tct gcg tct cag tec agt gtg aaa aaa gec agt att 1802 
Ser Ser Asp Ser Ser Ala Ser Gin Ser Ser Val Lys Lys Ala Ser He 
540 545 550 555 

aaa ata ctg aaa aat ttt gat gaa gcg ata att gtg gat gcg gca agt 1850 
Lys He Leu Lys Asn Phe Asp Glu Ala He He Val Asp Ala Ala Ser 
560 565 570 

ctg gat cca gaa tct tta tat caa egg aca tat gec ggg aag atg aca 1898 
Leu Asp Pro Glu Ser Leu Tyr Gin Arg Thr Tyr Ala Gly Lys Met Thr 
575 580 585 

ttt gga aga gtg agt gac ttg ggg caa ttc ate egg gaa tct gag cct 1946 
Phe Gly Arg Val Ser Asp Leu Gly Gin Phe He Arg Glu Ser Glu Pro 
590 595 600 

gaa cct gat gta agg aaa tea aaa gga tec atg ttc tea caa get atg 1994 
Glu Pro Asp Val Arg Lys Ser Lys Gly Ser Met Phe Ser Gin Ala Met 
605 610 615 

aag aaa tgg gtg caa gga aat act gat gag gee cag gaa gag eta get 2 042 

Lys Lys Trp Val Gin Gly Asn Thr Asp Glu Ala Gin Glu Glu Leu Ala 
620 625 630 635 

tgg aag att get aaa atg ata gtc agt gac att atg cag cag get cag 2090 
Trp Lys He Ala Lys Met He Val Ser Asp He Met Gin Gin Ala Gin 
640 645 650 

tat gat caa ccg tta gag aaa tct aca aag tta tga ctcaaaactt 2136 
Tyr Asp Gin Pro Leu Glu Lys Ser Thr Lys Leu * 
655 660 

gagataaagg aaatctgett gtgaaaaata agagaacttt tttcccttgg ttggattctt 2196 

caacacagcc aatgaaaaca gcactatatt tctgatctgt cactgttgtt tccagggaga 2256 

gaatggggag acaatcctag gacttccacc etaatgeagt tacctgtagg gcataattgg 2316 

atggcacatg atgtttcaca cagtgaggag tctttaaagg ttaccaa 2363 
<210> 32

<211> 662

<212> PRT

<213> Homo Sapien

<400> 32

Met Arg Gly Ala Gly Pro Ser Pro Arg Gln Ser Pro Arg Thr Leu Arg
1 5 10 15

Pro Asp Pro Gly Pro Ala Met Ser Phe Phe Arg Arg Lys Val Lys Gly
20 25 30

Lys Glu Gln Glu Lys Thr Ser Asp Val Lys Ser Ile Lys Ala Ser Ile
35 40 45

Ser Val His Ser Pro Gln Lys Ser Thr Lys Asn His Ala Leu Leu Glu
50 55 60

Ala Ala Gly Pro Ser His Val Ala Ile Asn Ala Ile Ser Ala Asn Met
65 70 75 80

Asp Ser Phe Ser Ser Ser Arg Thr Ala Thr Leu Lys Lys Gln Pro Ser
85 90 95

His Met Glu Ala Ala His Phe Gly Asp Leu Gly Arg Ser Cys Leu Asp
100 105 110

Tyr Gln Thr Gln Glu Thr Lys Ser Ser Leu Ser Lys Thr Leu Glu Gln
115 120 125

Val Leu His Asp Thr Ile Val Leu Pro Tyr Phe Ile Gln Phe Met Glu
130 135 140

Leu Arg Arg Met Glu His Leu Val Lys Phe Trp Leu Glu Ala Glu Ser
145 150 155 160

Phe His Ser Thr Thr Trp Ser Arg Ile Arg Ala His Ser Leu Asn Thr
165 170 175

Met Lys Gln Ser Ser Leu Ala Glu Pro Val Ser Pro Ser Lys Lys His
180 185 190

Glu Thr Thr Ala Ser Phe Leu Thr Asp Ser Leu Asp Lys Arg Leu Glu
195 200 205

Asp Ser Gly Ser Ala Gln Leu Phe Met Thr His Ser Glu Gly Ile Asp
210 215 220

Leu Asn Asn Arg Thr Asn Ser Thr Gln Asn His Leu Leu Leu Ser Gln
225 230 235 240

Glu Cys Asp Ser Ala His Ser Leu Arg Leu Glu Met Ala Arg Ala Gly
245 250 255

Thr His Gln Val Ser Met Glu Thr Gln Glu Ser Ser Ser Thr Leu Thr
260 265 270

Val Ala Ser Arg Asn Ser Pro Ala Ser Pro Leu Lys Glu Leu Ser Gly
275 280 285

Lys Leu Met Lys Ser Ile Glu Gln Asp Ala Val Asn Thr Phe Thr Lys
290 295 300

Tyr Ile Ser Pro Asp Ala Ala Lys Pro Ile Pro Ile Thr Glu Ala Met
305 310 315 320

Arg Asn Asp Ile Ile Ala Arg Ile Cys Gly Glu Asp Gly Gln Val Asp
325 330 335

Pro Asn Cys Phe Val Leu Ala Gln Ser Ile Val Phe Ser Ala Met Glu
340 345 350

Gln Glu His Phe Ser Glu Phe Leu Arg Ser His His Phe Cys Lys Tyr
355 360 365

Gln Ile Glu Val Leu Thr Ser Gly Thr Val Tyr Leu Ala Asp Ile Leu
370 375 380

Phe Cys Glu Ser Ala Leu Phe Tyr Phe Ser Glu Tyr Met Glu Lys Glu
385 390 395 400

Asp Ala Val Asn Ile Leu Gln Phe Trp Leu Ala Ala Asp Asn Phe Gln
405 410 415

Ser Gln Leu Ala Ala Lys Lys Gly Gln Tyr Asp Gly Gln Glu Ala Gln
420 425 430

Asn Asp Ala Met Ile Leu Tyr Asp Lys Tyr Phe Ser Leu Gln Ala Thr
435 440 445

His Pro Leu Gly Phe Asp Asp Val Val Arg Leu Glu Ile Glu Ser Asn
450 455 460

Ile Cys Arg Glu Gly Gly Pro Leu Pro Asn Cys Phe Thr Thr Pro Leu
465 470 475 480

Arg Gln Ala Trp Thr Thr Met Glu Lys Val Phe Leu Pro Gly Phe Leu
485 490 495

Ser Ser Asn Leu Tyr Tyr Lys Tyr Leu Asn Asp Leu Ile His Ser Val
500 505 510

Arg Gly Asp Glu Phe Leu Gly Gly Asn Val Ser Pro Thr Ala Pro Gly
515 520 525

Ser Val Gly Pro Pro Asp Glu Ser His Pro Gly Ser Ser Asp Ser Ser
530 535 540

Ala Ser Gln Ser Ser Val Lys Lys Ala Ser Ile Lys Ile Leu Lys Asn
545 550 555 560

Phe Asp Glu Ala Ile Ile Val Asp Ala Ala Ser Leu Asp Pro Glu Ser
565 570 575

Leu Tyr Gln Arg Thr Tyr Ala Gly Lys Met Thr Phe Gly Arg Val Ser
580 585 590

Asp Leu Gly Gln Phe Ile Arg Glu Ser Glu Pro Glu Pro Asp Val Arg
595 600 605

Lys Ser Lys Gly Ser Met Phe Ser Gln Ala Met Lys Lys Trp Val Gln
610 615 620

Gly Asn Thr Asp Glu Ala Gln Glu Glu Leu Ala Trp Lys Ile Ala Lys
625 630 635 640

Met Ile Val Ser Asp Ile Met Gln Gln Ala Gln Tyr Asp Gln Pro Leu
645 650 655

Glu Lys Ser Thr Lys Leu
660


<210> 33 

<211> 2363 

<212> DNA 

<213> Homo Sapien 

<220> 
<221> CDS 

<222> (138)...(2126)
<223> AKAP-10-5 



<221> allele 
<222> 2073 

<223> Single Nucleotide Polymorphism: A to G 
<400> 33 

gcggcttgtt gataatatgg cggctggagc tgcctgggca tcccgaggag gcggtggggc 60
ccactcccgg aagaagggtc ccttttcgcg ctagtgcagc ggcccctctg gacccggaag 120
tccgggccgg ttgctga atg agg gga gcc ggg ccc tcc ccg cgc cag tcc 170
Met Arg Gly Ala Gly Pro Ser Pro Arg Gln Ser
1 5 10

ccc cgc acc ctc cgt ccc gac ccg ggc ccc gcc atg tcc ttc ttc cgg 218
Pro Arg Thr Leu Arg Pro Asp Pro Gly Pro Ala Met Ser Phe Phe Arg
15 20 25

cgg aaa gtg aaa ggc aaa gaa caa gag aag acc tca gat gtg aag tcc 266
Arg Lys Val Lys Gly Lys Glu Gln Glu Lys Thr Ser Asp Val Lys Ser
30 35 40

atc aaa gct tca ata tcc gta cat tcc cca caa aaa agc act aaa aat 314
Ile Lys Ala Ser Ile Ser Val His Ser Pro Gln Lys Ser Thr Lys Asn
45 50 55

cat gcc ttg ctg gag gct gca gga cca agt cat gtt gca atc aat gcc 362
His Ala Leu Leu Glu Ala Ala Gly Pro Ser His Val Ala Ile Asn Ala
60 65 70 75

att tct gcc aac atg gac tcc ttt tca agt agc agg aca gcc aca ctt 410
Ile Ser Ala Asn Met Asp Ser Phe Ser Ser Ser Arg Thr Ala Thr Leu
80 85 90

aag aag cag cca agc cac atg gag gct gct cat ttt ggt gac ctg ggc 458
Lys Lys Gln Pro Ser His Met Glu Ala Ala His Phe Gly Asp Leu Gly
95 100 105

aga tct tgt ctg gac tac cag act caa gag acc aaa tca agc ctt tct 506
Arg Ser Cys Leu Asp Tyr Gln Thr Gln Glu Thr Lys Ser Ser Leu Ser
110 115 120

aag acc ctt gaa caa gtc ttg cac gac act att gtc ctc cct tac ttc 554
Lys Thr Leu Glu Gln Val Leu His Asp Thr Ile Val Leu Pro Tyr Phe
125 130 135

att caa ttc atg gaa ctt cgg cga atg gag cat ttg gtg aaa ttt tgg 602
Ile Gln Phe Met Glu Leu Arg Arg Met Glu His Leu Val Lys Phe Trp
140 145 150 155

tta gag gct gaa agt ttt cat tca aca act tgg tcg cga ata aga gca 650
Leu Glu Ala Glu Ser Phe His Ser Thr Thr Trp Ser Arg Ile Arg Ala
160 165 170

cac agt cta aac aca atg aag cag agc tca ctg gct gag cct gtc tct 698
His Ser Leu Asn Thr Met Lys Gln Ser Ser Leu Ala Glu Pro Val Ser
175 180 185

cca tct aaa aag cat gaa act aca gcg tct ttt tta act gat tct ctt 746
Pro Ser Lys Lys His Glu Thr Thr Ala Ser Phe Leu Thr Asp Ser Leu
190 195 200

gat aag aga ttg gag gat tct ggc tca gca cag ttg ttt atg act cat 794
Asp Lys Arg Leu Glu Asp Ser Gly Ser Ala Gln Leu Phe Met Thr His
205 210 215

tca gaa gga att gac ctg aat aat aga act aac agc act cag aat cac 842
Ser Glu Gly Ile Asp Leu Asn Asn Arg Thr Asn Ser Thr Gln Asn His
220 225 230 235

ttg ctg ctt tcc cag gaa tgt gac agt gcc cat tct ctc cgt ctt gaa 890
Leu Leu Leu Ser Gln Glu Cys Asp Ser Ala His Ser Leu Arg Leu Glu
240 245 250

atg gcc aga gca gga act cac caa gtt tcc atg gaa acc caa gaa tct 938
Met Ala Arg Ala Gly Thr His Gln Val Ser Met Glu Thr Gln Glu Ser
255 260 265

tcc tct aca ctt aca gta gcc agt aga aat agt ccc gct tct cca cta 986
Ser Ser Thr Leu Thr Val Ala Ser Arg Asn Ser Pro Ala Ser Pro Leu
270 275 280

aaa gaa ttg tca gga aaa cta atg aaa agt ata gaa caa gat gca gtg 1034
Lys Glu Leu Ser Gly Lys Leu Met Lys Ser Ile Glu Gln Asp Ala Val
285 290 295

aat act ttt acc aaa tat ata tct cca gat gct gct aaa cca ata cca 1082
Asn Thr Phe Thr Lys Tyr Ile Ser Pro Asp Ala Ala Lys Pro Ile Pro
300 305 310 315

att aca gaa gca atg aga aat gac atc ata gca agg att tgt gga gaa 1130
Ile Thr Glu Ala Met Arg Asn Asp Ile Ile Ala Arg Ile Cys Gly Glu
320 325 330

gat gga cag gtg gat ccc aac tgt ttc gtt ttg gca cag tcc ata gtc 1178
Asp Gly Gln Val Asp Pro Asn Cys Phe Val Leu Ala Gln Ser Ile Val
335 340 345

ttt agt gca atg gag caa gag cac ttt agt gag ttt ctg cga agt cac 1226
Phe Ser Ala Met Glu Gln Glu His Phe Ser Glu Phe Leu Arg Ser His
350 355 360

cat ttc tgt aaa tac cag att gaa gtg ctg acc agt gga act gtt tac 1274
His Phe Cys Lys Tyr Gln Ile Glu Val Leu Thr Ser Gly Thr Val Tyr
365 370 375

ctg gct gac att ctc ttc tgt gag tca gcc ctc ttt tat ttc tct gag 1322
Leu Ala Asp Ile Leu Phe Cys Glu Ser Ala Leu Phe Tyr Phe Ser Glu
380 385 390 395

tac atg gaa aaa gag gat gca gtg aat atc tta caa ttc tgg ttg gca 1370
Tyr Met Glu Lys Glu Asp Ala Val Asn Ile Leu Gln Phe Trp Leu Ala
400 405 410

gca gat aac ttc cag tct cag ctt gct gcc aaa aag ggg caa tat gat 1418
Ala Asp Asn Phe Gln Ser Gln Leu Ala Ala Lys Lys Gly Gln Tyr Asp
415 420 425

gga cag gag gca cag aat gat gcc atg att tta tat gac aag tac ttc 1466
Gly Gln Glu Ala Gln Asn Asp Ala Met Ile Leu Tyr Asp Lys Tyr Phe
430 435 440

tcc ctc caa gcc aca cat cct ctt gga ttt gat gat gtt gta cga tta 1514
Ser Leu Gln Ala Thr His Pro Leu Gly Phe Asp Asp Val Val Arg Leu
445 450 455

gaa att gaa tcc aat atc tgc agg gaa ggt ggg cca ctc ccc aac tgt 1562
Glu Ile Glu Ser Asn Ile Cys Arg Glu Gly Gly Pro Leu Pro Asn Cys
460 465 470 475

ttc aca act cca tta cgt cag gcc tgg aca acc atg gag aag gtc ttt 1610
Phe Thr Thr Pro Leu Arg Gln Ala Trp Thr Thr Met Glu Lys Val Phe
480 485 490

ttg cct ggc ttt ctg tcc agc aat ctt tat tat aaa tat ttg aat gat 1658
Leu Pro Gly Phe Leu Ser Ser Asn Leu Tyr Tyr Lys Tyr Leu Asn Asp
495 500 505

ctc atc cat tcg gtt cga gga gat gaa ttt ctg ggc ggg aac gtg tcg 1706
Leu Ile His Ser Val Arg Gly Asp Glu Phe Leu Gly Gly Asn Val Ser
510 515 520

ccg act gct cct ggc tct gtt ggc cct cct gat gag tct cac cca ggg 1754
Pro Thr Ala Pro Gly Ser Val Gly Pro Pro Asp Glu Ser His Pro Gly
525 530 535

agt tct gac agc tct gcg tct cag tcc agt gtg aaa aaa gcc agt att 1802
Ser Ser Asp Ser Ser Ala Ser Gln Ser Ser Val Lys Lys Ala Ser Ile
540 545 550 555

aaa ata ctg aaa aat ttt gat gaa gcg ata att gtg gat gcg gca agt 1850
Lys Ile Leu Lys Asn Phe Asp Glu Ala Ile Ile Val Asp Ala Ala Ser
560 565 570

ctg gat cca gaa tct tta tat caa cgg aca tat gcc ggg aag atg aca 1898
Leu Asp Pro Glu Ser Leu Tyr Gln Arg Thr Tyr Ala Gly Lys Met Thr
575 580 585

ttt gga aga gtg agt gac ttg ggg caa ttc atc cgg gaa tct gag cct 1946
Phe Gly Arg Val Ser Asp Leu Gly Gln Phe Ile Arg Glu Ser Glu Pro
590 595 600

gaa cct gat gta agg aaa tca aaa gga tcc atg ttc tca caa gct atg 1994
Glu Pro Asp Val Arg Lys Ser Lys Gly Ser Met Phe Ser Gln Ala Met
605 610 615

aag aaa tgg gtg caa gga aat act gat gag gcc cag gaa gag cta gct 2042
Lys Lys Trp Val Gln Gly Asn Thr Asp Glu Ala Gln Glu Glu Leu Ala
620 625 630 635

tgg aag att gct aaa atg ata gtc agt gac gtt atg cag cag gct cag 2090
Trp Lys Ile Ala Lys Met Ile Val Ser Asp Val Met Gln Gln Ala Gln
640 645 650

tat gat caa ccg tta gag aaa tct aca aag tta tga ctcaaaactt 2136
Tyr Asp Gln Pro Leu Glu Lys Ser Thr Lys Leu *
655 660

gagataaagg aaatctgctt gtgaaaaata agagaacttt tttcccttgg ttggattctt 2196

caacacagcc aatgaaaaca gcactatatt tctgatctgt cactgttgtt tccagggaga 2256

gaatggggag acaatcctag gacttccacc ctaatgcagt tacctgtagg gcataattgg 2316

atggcacatg atgtttcaca cagtgaggag tctttaaagg ttaccaa 2363
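
The feature table for SEQ ID NO: 33 gives the coding region as (138)...(2126) and places the A to G allele at nucleotide 2073. The following is a minimal sketch, not part of the original listing, of how that nucleotide position can be mapped to a codon number and translated. The function name `codon_for_position` and the two-entry codon lookup are illustrative assumptions; the codons `att` and `gtt` are read from the sequence rows ending at 2090 in SEQ ID NO: 31 and SEQ ID NO: 33.

```python
# Hypothetical helper (not from the listing): map a 1-based nucleotide
# position inside a CDS to (codon number, base position within the codon).
def codon_for_position(pos, cds_start, cds_end):
    if not cds_start <= pos <= cds_end:
        raise ValueError("position lies outside the coding region")
    offset = pos - cds_start          # 0-based offset from the first CDS base
    return offset // 3 + 1, offset % 3 + 1

# Only the two codons involved are listed; both follow the standard genetic code.
CODON_TABLE = {"att": "Ile", "gtt": "Val"}

CDS_START, CDS_END = 138, 2126        # from the <222> line of SEQ ID NO: 33
SNP_POS = 2073                        # from the allele feature

codon_number, base_in_codon = codon_for_position(SNP_POS, CDS_START, CDS_END)

ref_codon = "att"                                     # SEQ ID NO: 31
alt_codon = (ref_codon[:base_in_codon - 1] + "g"
             + ref_codon[base_in_codon:])             # A to G substitution

# Number of codons in the CDS, including the terminal stop codon.
total_codons = (CDS_END - CDS_START + 1) // 3

print(codon_number, base_in_codon, CODON_TABLE[ref_codon], CODON_TABLE[alt_codon])
print(total_codons - 1)               # residues encoded, excluding the stop
```

Under these assumptions the substitution falls on the first base of codon 646 and changes Ile to Val, and the coding region accounts for 662 residues plus a stop codon, which agrees with the length given in the <211> lines of SEQ ID NO: 32 and SEQ ID NO: 34.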

<210> 34 

<211> 662 

<212> PRT 

<213> Homo Sapien 



<400> 34

Met Arg Gly Ala Gly Pro Ser Pro Arg Gln Ser Pro Arg Thr Leu Arg
1 5 10 15

Pro Asp Pro Gly Pro Ala Met Ser Phe Phe Arg Arg Lys Val Lys Gly
20 25 30

Lys Glu Gln Glu Lys Thr Ser Asp Val Lys Ser Ile Lys Ala Ser Ile
35 40 45

Ser Val His Ser Pro Gln Lys Ser Thr Lys Asn His Ala Leu Leu Glu
50 55 60

Ala Ala Gly Pro Ser His Val Ala Ile Asn Ala Ile Ser Ala Asn Met
65 70 75 80

Asp Ser Phe Ser Ser Ser Arg Thr Ala Thr Leu Lys Lys Gln Pro Ser
85 90 95

His Met Glu Ala Ala His Phe Gly Asp Leu Gly Arg Ser Cys Leu Asp
100 105 110

Tyr Gln Thr Gln Glu Thr Lys Ser Ser Leu Ser Lys Thr Leu Glu Gln
115 120 125

Val Leu His Asp Thr Ile Val Leu Pro Tyr Phe Ile Gln Phe Met Glu
130 135 140

Leu Arg Arg Met Glu His Leu Val Lys Phe Trp Leu Glu Ala Glu Ser
145 150 155 160

Phe His Ser Thr Thr Trp Ser Arg Ile Arg Ala His Ser Leu Asn Thr
165 170 175

Met Lys Gln Ser Ser Leu Ala Glu Pro Val Ser Pro Ser Lys Lys His
180 185 190

Glu Thr Thr Ala Ser Phe Leu Thr Asp Ser Leu Asp Lys Arg Leu Glu
195 200 205

Asp Ser Gly Ser Ala Gln Leu Phe Met Thr His Ser Glu Gly Ile Asp
210 215 220

Leu Asn Asn Arg Thr Asn Ser Thr Gln Asn His Leu Leu Leu Ser Gln
225 230 235 240

Glu Cys Asp Ser Ala His Ser Leu Arg Leu Glu Met Ala Arg Ala Gly
245 250 255

Thr His Gln Val Ser Met Glu Thr Gln Glu Ser Ser Ser Thr Leu Thr
260 265 270

Val Ala Ser Arg Asn Ser Pro Ala Ser Pro Leu Lys Glu Leu Ser Gly
275 280 285

Lys Leu Met Lys Ser Ile Glu Gln Asp Ala Val Asn Thr Phe Thr Lys
290 295 300

Tyr Ile Ser Pro Asp Ala Ala Lys Pro Ile Pro Ile Thr Glu Ala Met
305 310 315 320

Arg Asn Asp Ile Ile Ala Arg Ile Cys Gly Glu Asp Gly Gln Val Asp
325 330 335

Pro Asn Cys Phe Val Leu Ala Gln Ser Ile Val Phe Ser Ala Met Glu
340 345 350

Gln Glu His Phe Ser Glu Phe Leu Arg Ser His His Phe Cys Lys Tyr
355 360 365

Gln Ile Glu Val Leu Thr Ser Gly Thr Val Tyr Leu Ala Asp Ile Leu
370 375 380

Phe Cys Glu Ser Ala Leu Phe Tyr Phe Ser Glu Tyr Met Glu Lys Glu
385 390 395 400

Asp Ala Val Asn Ile Leu Gln Phe Trp Leu Ala Ala Asp Asn Phe Gln
405 410 415

Ser Gln Leu Ala Ala Lys Lys Gly Gln Tyr Asp Gly Gln Glu Ala Gln
420 425 430

Asn Asp Ala Met Ile Leu Tyr Asp Lys Tyr Phe Ser Leu Gln Ala Thr
435 440 445

His Pro Leu Gly Phe Asp Asp Val Val Arg Leu Glu Ile Glu Ser Asn
450 455 460

Ile Cys Arg Glu Gly Gly Pro Leu Pro Asn Cys Phe Thr Thr Pro Leu
465 470 475 480

Arg Gln Ala Trp Thr Thr Met Glu Lys Val Phe Leu Pro Gly Phe Leu
485 490 495

Ser Ser Asn Leu Tyr Tyr Lys Tyr Leu Asn Asp Leu Ile His Ser Val
500 505 510

Arg Gly Asp Glu Phe Leu Gly Gly Asn Val Ser Pro Thr Ala Pro Gly
515 520 525

Ser Val Gly Pro Pro Asp Glu Ser His Pro Gly Ser Ser Asp Ser Ser
530 535 540

Ala Ser Gln Ser Ser Val Lys Lys Ala Ser Ile Lys Ile Leu Lys Asn
545 550 555 560

Phe Asp Glu Ala Ile Ile Val Asp Ala Ala Ser Leu Asp Pro Glu Ser
565 570 575

Leu Tyr Gln Arg Thr Tyr Ala Gly Lys Met Thr Phe Gly Arg Val Ser
580 585 590

Asp Leu Gly Gln Phe Ile Arg Glu Ser Glu Pro Glu Pro Asp Val Arg
595 600 605

Lys Ser Lys Gly Ser Met Phe Ser Gln Ala Met Lys Lys Trp Val Gln
610 615 620

Gly Asn Thr Asp Glu Ala Gln Glu Glu Leu Ala Trp Lys Ile Ala Lys
625 630 635 640

Met Ile Val Ser Asp Val Met Gln Gln Ala Gln Tyr Asp Gln Pro Leu
645 650 655

Glu Lys Ser Thr Lys Leu
660
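
SEQ ID NO: 32 and SEQ ID NO: 34 are both 662 residues long, and the allele recorded for SEQ ID NO: 33 implies that they differ at one residue. The following is a minimal sketch, not part of the original listing, of how two rows of three-letter residue codes can be compared position by position. The function name `residue_differences` is an illustrative assumption, and the two input rows are the residue 641 to 656 rows of the two translations.

```python
# Hypothetical helper (not from the listing): compare two rows of
# three-letter residue codes and report every position that differs.
def residue_differences(row_a, row_b, first_residue):
    residues_a = row_a.split()
    residues_b = row_b.split()
    if len(residues_a) != len(residues_b):
        raise ValueError("rows must contain the same number of residues")
    return [
        (first_residue + i, a, b)
        for i, (a, b) in enumerate(zip(residues_a, residues_b))
        if a != b
    ]

# Residues 641-656 as translated in SEQ ID NO: 31 and SEQ ID NO: 33.
ROW_REF = "Met Ile Val Ser Asp Ile Met Gln Gln Ala Gln Tyr Asp Gln Pro Leu"
ROW_ALT = "Met Ile Val Ser Asp Val Met Gln Gln Ala Gln Tyr Asp Gln Pro Leu"

differences = residue_differences(ROW_REF, ROW_ALT, 641)
print(differences)
```

For these two rows the only reported difference is at residue 646 (Ile in one row, Val in the other). The same function could be applied row by row over the full sequences, provided both are split into rows at the same residue boundaries.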

<210> 35 

<211> 162025 

<212> DNA 

<213> Homo Sapien

<300>

<308> GenBank AC005730
<309> 1998-10-22

<400> 35

gaattcctat ttcaaaagaa acaaatgggc caagtatggt ggctcatacc tgtaatccca 60 

gcactttggg aggccgaggt gagtgggtca cttgaggtca ggagttccag gccagtctgg 120

ccaacatggt gaaacactgt ctctactaaa aatacaaaaa ttagccgggc gtggtggcgg 180

gcacctgtaa tcccagctac tcaggaggct gaggcaggag aattgcttga acctgggaga 240 

tggaggttgc agtgagccga gatcgcgcca ctgctctcca gcctgggtgg cagagtgaga 300

ctctgcctca aaaagaaaca aagaaataaa tgaaacaatt ttgttcacat atatttcaca 360 

aatttgaaat gttaaaggta ttatggtcac tgatatcctg tttcattctt tatataatca 420 

ttaagtttga aatgtatact tgcactacta acacagtagt taatcttagt cctacaagtt 480 

artgctttta cacaatatat tttcgtaata tgtatgcact ggtgtttatg tacgtgttta 540 

tgtttatatc tgttaaaatt agcagtttcc atctttttct attttgtacc atcacatcag 600 

ttcagaagga ttgacagagc aaaatgattt gatgaagtat aaaagtcaca tggtgagtgg 660 

cataaataca actctgaaca attaggaggc tcactattga ctggaactaa actgcaagcc 720 

agaaagacac atatcctata tgtcaagaga tgtaccaccc aggcagttaa agaagggaag 780 

tacacataga aagcacaatg gtgaataatt aaaaaattgg aatttatcag acactggatt 840 

catttgctcc taaagtcaga gtcctctatt gtttttttgt ttttgtgggt ttctttttaa 900 

atttttttat tttttgtaga gtcggagtct cactgtgtta cccgggctgg tctagaactc 960 

ctggcctcaa acaaacctcc tgcctcagct tcccaaagca ttgggattac agacatgagc 1020 

cactgagccc agcccagacg ctttagcatt tatgaagctt ctgaaatagt tgtagaaacc 1080 

gcataagctt tccatgtcac tttcaaagtt tgatggtctc tttagtaaac caaccaagtt 1140 

attcctcaag ggcaaaataa catttctcag tgcaaaactg atgcacttca ttaccaaaag 1200 

gaaaagacca caactataga ggcgtcattg aaagctgcac tcttcagagg ccaaaaaaaa 1260 

aggtacaaac acatactaat ggaacattct ttagaagagc cccaaagtta atgataaaca 1320 

ttttcatcaa agagaaaaga gaacaaggtg ttagcaaatt cctctatcaa ataacactaa 1380 

acatcaagga acatcaatgg catgccatgt ggaagaggaa gtgctagctc atgtacaaac 1440 

cagtagataa tttcaacttg ctgccgaatg aaacctcttt gcaaggtatg aatcagcact 1500 

tctcatgttt gttttgcttt gttttgtttt gtttttagag acaggccctt gctctgtcac 1560 

acaggctgga gtgcagtggc acgatcagag ctcactgcaa cctgaaactc ctgggctcaa 1620 

gggatcctcc tgccttagcc tcccaagtag ctgggactac aggcccacca tgcccagcta 1680 

attttttaaa ttttctatag agatgggatc tcactagcac ctttcatgtt tgatgttcat 1740 

atacaacgac caaggtacaa tgtggaaaag ggtctcaggg atctaaagtg aaggaggacc 1800 

agaaagaaaa ggggttgcta catagagtag aagaagttgc acttcatgcc agtctacaac 1860 

actgctgttt tcctcagagc agagttgatg atctaaatca ggggtcccca acccccagtt 1920 

catagcctgt taggaaccgg gccacacagc aggaggtgag caataggcaa gcgagcatta 1980 

ccacctgggc ttcacctccc gtcagatcag tgatgtcatt agattctcat aggaccatga 2040 

accctattgt gaactgagca tgcaagggat gtaggttttc cgctctttat gagactctaa 2100 

tgccggaaga tctgtcactg tcttccatca ccctgagatg ggaacatcta gttgcaggaa 2160 

aacaacctca gggctcccat tgattctata ttacagtgag ttgtatcatt atttcattct 2220 

atattacaat gtaataataa tagaaataaa ggcacaatag gccaggcgtg gtggctcaca 2280 
cctgtaatcc cagcacttcg ggaggccaag gcaggcggat cacgaggtca ggagatcgag 2340

accatcccgg ctaaaacggt gaaaccccgt ctactaaaaa ttcaaaaaaa aattagccgg 2400

gcgtggtggt gggcacctgt agtcccagct actcgagagg ctgaggcagg agaatggtgc 2460

gaacctggga ggcagagctt gaggtaagcc gagatcacgc cactgcactc cagcctgggc 2520 

gacagagcga tactctgtct caaaaaaaaa aaaaaaaaaa aaagaaataa agtgaacaac 2580 

aaatgtaatg tggctgaatc attccaaaac aatcccccca ccccagttca cggaaaaatt 2640 

ctcccacaaa accagtccct ggtgccaaaa aggttgggga ccgctaatct aaataatcta 2700 

atcttcattc aatgctaaaa aatgaataaa ctttttttta aatacacggt ctcactttgt 2760 

tgcccaggct ggagtacggt ggcatgatca cagctcactg tagcctcaat cacccaggcc 2820 

ccagcgatcc tcccacctaa acttcctgag tagctgggac tacaggcacg caccaccatg 2880 

cccagctaat ttttaaattt tttatagaga tgggggtctc accatgttgc ccagactggt 2940

ctcaaaccct gggctcaagt gatcctccct caaactcctg gactcaagtg atcctccttc 3000 

cttggcctcc caaagtgctg ggattacaag catgagccac tgtacccagc tggataaaca 3060

ttttaagtcg cactacagtc atggacaatc aggcttttca acatgcagta tggacagtga 3120 

gtcccagggt ctgcttttcc atactgaaat acatgtgata ctaaggagaa aggtgctcgc 3180 

aaggatattt aaaatgaaga atatttaaaa tgaggaaaaa actgtttctt catgactttg 3240 

ataaggctga taaagaccat ttctgtgatc tcaggtgatt cactcaagta gtatatttca 3300 

gtaatcatta tctggaacag cctgaatctt aaccaaaata ccatgatttt ttaatgctgt 3360 

tatgatacct tgatgatatg accaaactgc aatgtaggca gctaaatctc cacgagtttg 3420 

acttccccga gagttgacag ttttcttcac aaattaaaga aatatatttt ttgatacatg 3480 

attggcatat ttaaaaacta cactgaaatg ctgcaaaatg atataaagaa acattttcca 3540

gaatcaaatg caatcaaaga gtggattagg aatctactca ccattatcaa ctaaatagaa 3600 

acacttggac tgggtgtggt ggctcacatc tgtaatctca gcactttggg aggccaaggc 3660 

aggtggattg cttgaggcca ggagctcaag accagcctga gcaacatagc aaaactctgt 3720 

ctctacaaaa aaaaaaaaaa attaaccagg catggtggca gatgcttgta atcccagcta 3780 

ctctggaagc tgaagtagga ggactgcttg agcccaggag atcaagactg cagtgagccg 3840

tggtcatgct gcgccacagc ctgagtgaca gagagagacc ctgtctcaaa aacaaaaaca 3900

aacaaaaaac acttaacctt cctgtttttt gctgttgttg ttgttgtttg tttgttttga 3960

gatggagtct cactctgttg cccaggctgg agtgcagtgg cgtgatcttg gctcactgca 4020 

agctctgcct cccgggttca cgccattctc ctgcctcagc ctcccgagta gctgggacta 4080

taggcgcccg ccaccacgcc cggctacttt tttgcatttt tagtagagat ggggtttcac 4140

cgtgttagcc aggatggtct tgatctcctg acctcgtgat ccacctgcct cggcctccca 4200 

aagtgctggg attacaggca tgagccaccg cacccggcca acctttctgt tttttagttt 4260 

gatatgcttg ttaactcagc agctgaaaga atgctgaaag tggccttcag taaaaaaatt 4320

tcactagaat ctctacatcc atatttaatc tgaatgcata tccagattga tcagttagag 4380 

caaaaacact catcatcatt cctgatgacc tctaattctg gtttcggctt tctatttcaa 4440 

tggaaacaga ataaggaaag aaatggaagg gctctggaaa tttgtcctgg gctatagata 4500 

ctatcaaaga tcaccaacaa taagatctct cctataaata taaaacaagt ataattaatt 4560 

ttttaattat ttttttctct tcagaggatt ttatttcaag ataaaacata acttctaccc 4620 

atactattga ttccaaaggt tagaaaaagt gtttttcctc atcttatcct tcaaagaggt 4680 

cacagcaatg caaacatcta taaaatgcct ctgcataatt gtcagaagct atagtccaga 4740

aatcattgaa aatgcttttc cattttaagc ttaggtgagg tgtcttagga aacctctatg 4800

acaacttact ctatttattg ggaggtaaac tcccagactc tcccagggtc tcctgtattg 4860 

atctcatttt ttaggcttcc taatcccttg aagcacaatc gaaaaagccc tggatctctt 4920

ttctgcacat atcatcgcgg aattcattcg gcttccagca agctgacact ccatgataca 4980 

agcggcctcg cccttctccg gacgccagtc cttgctgcgg ttagctagga tgaggggttt 5040 

gctgggcttc agtgcaggct tctgcgggtt cccaagccgc accaggtggc ctcacaggct 5100 

ggatgtcacc attgcacact gagctcctgg caggctgtac caatttttta attatttaat 5160 

atttattttt aaaattatgg tgaatatttt ggtattctgc tctaaaatag gcccataaat 5220 

gcacagcaga tatctcttgg aacccacagc tttccactgg aagaactaag tatttttctt 5280 

ttaaagatgc tactaagtct ctgaaaagtc cagatcctct acctctttcc atcccaaact 5340 

aagacttgga atttatgaga gatctagcta acagaaatcc cagacacatc attggttctt 5400

cccagagtgc agtcctccta aagaggctca gccctaagca ggcccctgca ccaggagggt 5460

gggtctgaga cccacatagc acttcccaag gtgcatgctc cagagaggca ctgaaacagc 5520 

tgagcacaag cctgcaagcc tggagaactc tcacagtcag aacggagggg gcccagtggg 5580 

actaacataa agagaaaagg gaacacagag aaatggatgg caccaacaac cagcaaagcc 5640 

ttcatggcca atgaaagcat cagtgacggg gccagaaccc tcatccccaa agactcttca 5700 

ctgcctttag tgaaaaacaa tggctagaga gtgaagttat gatcatgtat agagaggtaa 5760 

agttacattt ttatattctg actctgctaa tgtgaaattc cctatctgct agactaaaag 5820 

tttcagacac cctgttcaaa tatcccatta gttgctagag acttaaaatg aacagaacgc 5880 
acattgtcag gatgactatt accaaaaaat caaaagacag caagtattgg tgaggatgta 5940

gagaaaccgg aacttttgtg cactgtttat gagaatgtaa aatggagcag ctgctgtgga 6000 

aaagagcacg caggttcctc aaagagtaaa accaagatgc ggaaacaact aaatgcccat 6060 

cagtggatga aggggtagac aatatgtggt atatacatac catggagtac tattcagcct 6120 

ctaaaaaaaa aaaaggaaat tccataacat gcaacagcat ggatgaatct tgaggacatc 6180 

tcgctaatga aataaggcag tcatagaaag acaaatactg cacgactcca cttatacgag 6240 

ataccaaaaa tagacaaatt catagaatca aagagtacaa tggaggttac ctggagctgc 6300 

agggcgggaa acgaggagtt actaatcaac gaacataacg ttgcagttaa gtaagacgaa 6360

taagctctca agatcagctg tacaacactg cacctagagt caacaataat gtattgtaca 6420 

cttaaaaact tgttaagggt agattaacaa atgtagtaga tccacaaatg tggttaagtg 6480 

ttcttaccac agtaaaataa aaaaagaata tcaagcccag gagttcgaga ctagcctggg 6540

taacatggtg aaaccctgtc tctacagaaa atacaaaaat tagccagctg tggaggtgca 6600 

ctcctaggga ggctgaggtg ggaggcttgc ttgagcccag gaggtcaagg ctgcagtgag 6660 

ccatgattgc accactgtac tccagcccag atgacagagc aagacaccac cccccccaaa 6720 

aaaagaaaaa gaatatcaaa cattttaaaa gatcagatac gcaagaacaa caacaaaaaa 6780

gagatgaaca gagcatcgac cctcatctag tgggattctt ggtctaactg aaaaacagac 6840

attgag^gac aaacaatgac agtgatgtga tcacagcaat tacacaggta tcccctgggg 6900

actgcag^ag aaaggaggaa tgcctaactt tcagaaaata gagaaagcgt caaacagttg 6960

gtgaaagcct tccaaaacta gagagaactg cacacaccaa atcacagaaa gaagaaaagc 7020 

cgtgggagat tctgggaccc accggctatt tttgatggct gaacaccctg ctgcaggaga 7080 

gacaggagct ugaaagcang gtgggatgaa acctcaaaca gctttgcctg cattgcttaa 7140 

gatgactggg cttgattaac tctagtcaat ggggacaatt caatcaaaga agaaagatgc 7200 

tcaaattcac attttagaat gattttttat ggcagtatgg ggaatagatt aaaagagagt 7260 

gaagctggag gcaugaaact tgttaagagg caactgaaac agtctagatg ataaataata 7320 

aactgacaga gtgactagaa aaatcagaac aggctgaatc aacagatacc tagatgaaaa 7380

taacaggact tgatcaccag ttgtatcttg gagaggaagg agttgtttcc ttgctttccc 7440 

tacgactggg aatacggaag gtttgccgtg tgtattggtt atatactggt gtgtagccaa 7500

tcactgaca*.* ccacctagca gcttaaaaca caaaggctta tctcccagtt tctgtgggcc 7560 

aggaatctaa gataggctta gctggctggt tctggctcag agtttctcaa gaggttgcaa 7620 

tcaagatgtc agctggggtt gcatcatctg aaggctcaac tggggccgga gggtccactt 7680 

ccaaggagtc cactcacctg cctgacaagg cagtgctggt tgttggcagg agatctcaat 7740

tcattgccaa gtgagcctct ctatagcatt gctggaacat cctccccatc tggcagttgg 7800 

cttctctcag catgagtgat ctgagagaga gagcaaggag gaagccacag tgttcttcct 7860 

actcctaccc ctaacactat ggacctactc ctaacactct cacttctgcc ttattccatt 7920 

agttagaaag ggaactaagc tccacctctt gaaataagaa gtgtcaaaga atttgtggat 7980

atatttaaau atcatcacac tgtggaagtg gatagggggt tcaattaatg ctgaacttga 8040 

aatgcctgag acattcaaat gtccaacagg caatgaacat acccatagat ggtcatgact 8100 

ttagcaagaa tagaggaaga tcacagaatt aaggaggaat tgaaaggtaa aagaagtgga 8160 

gtcagatccc ccccgaaaag tgagccatga aaggaacttt aactattgag ttagaggtca 8220 

gagtaggaaa tttcggtgga attctttttt aaagaaagga accatataag catgttttga 8280 

ggtagaggga gaataaatca gtagacaggg agaggtaaaa aacataaatg ataggggata 8340

gttgacaaag gtcttggcag aatcccttac ccattgactt ggggccaaga gagggacact 8400

tctttgtttg agggataagg aaaataagaa agaatgggtg ctatttagtg tggtcctgtc 8460 

tctagggcaa acgcataggt aacaaactgt gtgtgttagg aatatagatg tgacctcaca 8520 

ttgagactct cacctcaaat ccattttgtt gttacctgta ccttcctacc ttctcttttt 8580

gctacatgca gactgctgtt ttgtctxcct ggcctgttcc aggtttcagc attctggcat 8640 

atctgctacc ctgttcccaa acctctctag agtccatgct ccttccttgg atagtgtttg 8700 

attgggccac gtatctaaga agtgatgcct tcagttaggc ctgagaacct cctctatgga 8760 

aatctccacc agtgaccctg acagacttgg tatcttggag atgtcactgc tcccagcctg 8820 

tggtctagga gaacctcagc ctgggcctct agtagtatgg ataaggcgtt aaggtatctt 8880 

tgaaccagag tctgtcatat tcctcaatgt gggacagata aaacagtggt agtgctggtg 8940 

tttctgagct agaactctgg tttttggtct agattctttg atgtatgacc tttcagaggt 9000 

attaaaattt gttctaatac aatgttcaat acaaatgtag ttccttttct gttaggacct 9060 

caacaaaaca tgaccaactg tagatgaaca ttaaactatg acaattcatg gaaatgaata 9120 

cagtaatacc tgcggttccc ccattttagc agtcactatg gtgacatttg gcacaaatgg 9180 

ctatttaagg gtgcttttgt taaaacctac catcttacta ggcacatgat attgaaacta 9240

atgaaataat ggagaaactt cttaaaaact tttaatgaat aaagtgatga agtgataata 9300 

ttttagctgc tatttataaa gtgactatta caggtcaaac attcttctag ggtttttttg 9360 

ttgaagttgt cacatttaat ccttaataac ccactatgag tcaggtattc ttctctcccc 9420 

tttggacagt tggggaaatg ggggtcagag aggttaggta atttgctcag ggccacacaa 9480
cctgcatgta gaaaatctga gatttgtaca ggaacgtatc aaactctgaa gtccatgctt 9540

ctattttccc atgctgcctt tctaataaaa ggtaactaat gctactggat gccgccccca 9600 

aagtgagtca ctttcacccc accctacttg attttctcca taaaactaat cacatcctga 9660 

caacttattt attgctgatc tcccccacta gattataaac tcaataaaag caagatcctt 9720 

gtctgctgaa tatcagtacc taaaacgctg tctagcacag agcaagtaat taatatttgt 9780 

tgaatgaaca aataaaggaa aaaaattcaa aggaagaaaa agccctaaaa cagatgttta 9840 

cctaaacata cattttaaaa gaaagcatat aacaaattca ggacagaatt taaatttgat 9900 

tttttaaaga aataaccaag tgctagctgg gcacagtggc tcacacctgt aatcctagca 9960 

ctctgggagg ccgaggcagg cagatcactt gaggtcaaga gttcaagacc agcctggcca 10020

acatggtgaa acctgtctct actaaaaata cagaaattat ccaggcatgg tggcaggtcc 10080 

ctgtaacccc agctactcag gaggctgagt caggagaatt gcttgaaccc aggaggcaga 10140 

ggttgcagtg ggccaagatt gcaccactgc actccagcct gagtaacaaa gcaagactct 10200 

gtctgaagga gaaggaaaga aagaaggaaa gaaggaaaga aggaaagaag gaaagaagga 10260 

aagaaagaaa gaaagaaaga aagaaagaaa gaaagaaaga aagaaagaaa gaaagaaaga 10320 

aagaaagaaa aagaaagaaa gaaagaaaga accaagtgct tatttgggac ctactatgct 10380 

atgtttttcc atgcacgcta ttttcagtaa agcagttagc aaacttgcaa gatcataaca 10440 

acaaatatat gcttctataa ctctaaaatt gtgctttaag aagttcctct ttaccagctc 10500 

atgtatgcat tagttttcta agagttacta gtaacttttt ccctggagaa tatccacagc 10560 

cagtttattt aaccaaagga ggatgcttac taacatgaag ttatcaaatg tgagcctaag 10620 

ttgggccagt tcatgttaat atactccaga acaaaaacca tcctactgtc ctctgacaat 10680 

tttacctgaa aattcatttt ccacattacc aaggagccag ggtaggagaa tatagaaaga 10740 

ccacccaaga atccttactt ctttcagcaa aatcaattca aagtaggtaa ctaaacacat 10800 

gccctaacaa tgaatagcag attgtgctca gaagaatgat ctacaacatc ttactgtgaa 10860 

ggaactactg aaatattcca ataagacttc tctccaaaat gattttattg aatttgcatt 10920 

ttaaaaaata ttttaagcct aaattttaaa aggtttgata ttggtacatg aatagacaaa 10980 

cagacacgga ctagaccaag aattaggttc aaacatatac aggaatttaa tatacgataa 11040 

atctagtatt ccaaaggaac caacaaatgg tgttcagaca gcaggatagg catcaggaaa 11100 

aacacagctg ggcaccctac cttactccta acaccaggag taactgaagg agcaccaaat 11160 

atctatttat tttaattata gttttaagtt ctagggtacg tgtgcacaac atgcaggttt 11220 

attacatagg tatacatgtg ccatgttggt gaggagcacc aaatatttaa aagaaaaaaa 11280 

ttggccaggg gcggtggctc acacctgtaa tcccagcact ttgggaggcc aaggtgggca 11340 

gatcacctga ggtcgggagt tcgagaccag cctgagcaac atggagaaac cccatctcta 11400 

ctaaaaatac aaaattagcc aggcatggtg gcacatgcct gtaatcccag ctacttggga 11460 

ggctgaggca ggagaatagc tttaatctgg gaggcacagg ttgcggtgag ctgagatatt 11520 

gcactccagc ctgggcaaca agagcaaaac ttcaactcaa aaaaattaat aaataaataa 11580 

aaacaaagaa agaaaagaaa aaaatgaaaa tagtataatt agcagaagaa aacaccgtag 11640 

aatcctcgga ctcttaggat ggggaatgcc tataatataa aaaccccgaa gttataaaag 11700 

agaaaatcac ctacatacaa accaaatctt tctacatgcc taaaacatag cacaaacaca 11760 

gctaaataat catagctgaa tgaactggga aaacaaaact tgactcatat ccagacagag 11820 

ttaattttcc tacacataaa gagtacctat ataaacccaa caaaaaaacc accactaacc 11880 

caaaataaaa atgtgacagg taatgaacag gtagttcaca gagaatacaa atggctcttc 11940 

ggcacataag atgctcagac tgacttttac ttatttattt tttgagagac agggtcccac 12000 

gatgttgccc aggttaggct caaactcctg ggctcaaatg atagtaccag gactacaggt 12060 

gtgccccacc gcacctggct cctcaaccac ctgtattaac aggaaatgca aaataaaact 12120 

ttcaaatcta ttttacctat tagaatggca aaaatttgaa aaacttcaaa catcatcatg 12180 

ttggtgaggg tgtgaggaga ctggcactct cattttttgc tgatagcata tatatactga 12240 

tggcttctat ggaaagcaat ctggcagcgt ctatcaaatg tacaagtgca tatatccttt 12300 

gacaaagcaa ttccactcta ggaatgtgtt ctatatggtt gtgcttcctg gggctgggaa 12360 

ctgggagcta agggacaggg gcagaagata atcttctttt ccctccttcc ccgttaaaca 12420 

tgttgaattt tatatactgt aatatattat ttttcacaaa agataatttt taagcgatat 12480 

gtctgggaat tttttttttt cttttctgag acagggtctc actctgtcat ccaggctgga 12540 

atgccatggt atgatctcag ctgactgcag cctcgacctc ctgggttcaa gcaatcctcc 12600 

cacctcagcc tcctgagtag ctgggactac aggcacgtgc catcatgcta atttttgtat 12660 

atacagggtc tcactatgtt gcccaggcta atgtcaaact cctaggctca agcaatccac 12720 

ccacctcagg ctccaaagtg ctgggattac aggcgtgagc caccgcgcct ggccctggga 12780 

attcttacaa aagaaaaaat atctactctc cccttctatt aaagtcaaaa cagagaagga 12840 

aattcaacct ataatgaaag tagagaaggg cctcaaccct gagcaacaaa cacaaaggct 12900 

atttctgaga caggaatttg ctgaacaaaa tcgagggaag atgacaagaa tcaagactca 12960 

cttctcggct gggcgcagtg gctcacacct gtaatcccag cactttggga ggccgaggcg 13020 

gacagatcac gaggtcagga gattgagacc atactggcta acacagtgaa acccagtctc 13080 

tactaaaaat acaaaaaatt agccgggcgt ggtggcaggt gcctgtagtc ccagctactt 13140 

gggaagctga ggcaggagaa tggcgtgaac ccaggaagcg gagcttgcag tgagccgaga 13200 

tcacgccact gcactccagc ctgggtgaca gagcaagact ctgtctcaaa aaaaaaaaaa 13260 

aagactcatt tctctagatc ttgagccgta ttcaaattta tctcagctta gtgagaggtt 13320 

aaagcaagga atatccttcc ctgtgggccc tgctccttac tgaaggaagg taacggatga 13380 

gtcaaggaca ccaatggaga aaagcactaa caccattatc tgatgaacat tacgtgaaga 13440 

agggtaagaa gtgaagtgga attgctgaag aagtcagtga aagcggacat tcatttgggg 13500 

aaatggaata taggaaatcc ataaaagtga ttaaaaagat gttagaggct gaggcggggg 13560 

gaccacaggg tcaggagatc gagaccatcc tggctaacac ggtgaaaccc cacctctact 13620 

aaaaatacaa aaaattagcc aggcgtggtg gcaggcacct gtagtcccaa ctactcggga 13680 

gactgaggca ggagaatggc atgaacctgg gagacggagc ttgcagtgag ccgagatcac 13740 

gccactgcac tccagcctgg gtgacagagt gagactccat ctcaaaaaaa aaagttagat 13800 

acgagagata aagatccaac agacacacaa ctgctaattc tgaacagaac aaaacaaatg 13860 

gcacaggaaa agaaaattta agatataaca ccggaaaact ttcctgaaat tgagtaactg 13920 

aatctatagc ttgaaagggt ttagcatatg ccaagaaaaa tcagtagagt ccaaccagca 13980 

caagacacat ctagcaaggc tggtgattct accaacacag agaaagaagt gggtgaccca 14040 

taatgcggaa aaaggcagac catctgcagt cttctccaga acactggagt ctgaagacaa 14100 

aagaatgctg cctactgagc cagaagggag agaaagtgac ccaacacatc tttaccaagt 14160 

tagaatgtca cgcattatct aaaggctgca aaagccatga aagacatgaa agaacacaag 14220 

catttacaac atgaaagaac acaagcattc tcatactcaa gaatccttaa gaaaaatgta 14280 

gtcctaatcc agcccactga aagttaaatg tacttaatgt gctcattaat gggaacttca 14340 

tagcttcaaa tcagtctggt cccatctacc aacatctctc gcccggcttt cctgcaatag 14400 

tcagcacctt tccctcctcc cagtcttgtc ccctggagtc tgctctcagc atagcagagt 14460 

gaccacatca acacccaagt cagagccctc cagtgcgcac tggtctacaa agcccttccc 14520 

accccccacc ccacgtgccc tccggatcct tgtgacgtgt ctcctgcata ccctagcagc 14580 

cctggcctcc tcactgcccc tcctgtacat caggaaggcg actccttgag tcttggctct 14640 

ggccgcctcc tccacctgca gtgagttaac tcccttacct actctaggtc attgctcaaa 14700 

tgtcagcatc tcaatggggc cctccctgac taccctattt aaattctaca tactcccctt 14760 

gaccccatgg acctcactca ccctattcca cttttattct tacaatttag cacttgttct 14820 

cttctaacgt attctaagac ttactcattt attacattgt ttgccacccc ctctagtaca 14880 

taaactccag aggggcaggg atttctgtct atttattcat ttctttatcc ctaggacata 14940 

gaacagggca tagttcagag tattcaatgt tatcaatgaa tgaactagca gtagtaccag 15000 

ttccagttag gcacagaatt aaatctaaat agaattaaat ctcatggtct gggttaacta 15060 

tggatagaaa attagatata attttaagaa gcctagaaag aaaaaattaa taatgtaaaa 15120 

ataatattaa tttgataata ataacaaaaa ctctgccagg cactgtggct caaatctgca 15180 

atcccagcta ctcaggaggc tgaggtggaa ggatcacttg agaccagagt tcaagactca 15240 

gcctaggcaa cacggcaaga aactgtctct aaaaaaatta aaacttaaat ttttaaaaaa 15300 

gaattctcaa agcgtcacaa aaactggaga ttaaggtaca ggaagtgtga agtaatatta 15360 

ctatgctaat ggtttttttt ttttttagaa aggtataacc aaaagatttc tttctcaagt 15420 

cgataaactg agaaagataa gcatatcttc caattaacag agggggagga aaagccagat 15480 

acaacaaaat aagatataaa ttagtttcca gttgaaaaca agagtaggag ttattttgca 15540 

tcacctcacc tgtgacctcc cccagcccaa aaaacactac tgataaacag ggtagaaaag 15600 

catcatctca gataaagcag gaaaaactgc cacagtctca aaccacaaac tataagcaca 15660 

cacctggcca accctgccaa gtctgggctc agtaggagga acgtgctgag agctaggatg 15720 

taccaactta gacattctgt gggatacaga tgtccctgga agggtcacac catctcaaag 15780 

gcacctgtaa tgcccactga ttacagccac catatgtgag agagaaactc agggcactta 15840 

gagagtataa caagaacctt atgtcatctg agatgaggaa tcctcagccc tgcaaattaa 15900 

ccaactcttt agaacaactg gcaaaacata aatatccaca acttttgttt cagtaattcc 15960 

actcttagat atcaatccaa agtacatgag acagcagata cacacacaaa atggtattta 16020 

ctgcagcatt gtttataata gcaaaaaaca agaaataatc catatgtctc aataggatac 16080 

tgggtacatg agggtatgta cccatcattc aaccatcaaa aagagtgata tggatgtcca 16140 

cagatggaca taaaaagctg tgtgttacgt gaaaacaaac tcaagcagca gcaggatggg 16200 

cttatgatag tcagtatgag ctaatttctg gaaaaaaaaa tctagtgtgt gcacagaaaa 16260 

catctgaaag aacagaaaca aaactatcag cagaatattg agatgtttta ctaagttgta 16320 

tatctatact gcttgtaatt tttaccccaa gcaagaatta ctttttggaa aaagaaaatt 16380 

caggaaataa agcatttctt taaacttcat gtttaaacaa atggtgatgg aataaaagag 16440 

ttcttattca tcataaacac acacagcaca catgcacgca tgtgcgtgag cacacccttt 16500 

acttgataaa taccatgttg aatattttag tctttccttt taggttctat cccttcactc 16560 

aaaatgcggt tataaataaa tgtacttttc atgtgccttc tgcctaaacc cactttaata 16620 

taactttaca gtcccattat cattatagtc tcaaagctag actcagcctg aaactaccct 16680 

ttcatttgga acccttatta aaatgccaca tacagctcct tcaaataaaa acaaacccca 16740 

ggacctgaca ctaggcttcc tttgttgcta ctcataatgg ccaagttctg tgcttataat 16800 

acatcccctc tcattttatt gctacatacc caagggtttt atatgttttt cttattatat 16860 

ctcaactcaa aacaccatca cgctcttttc cagatgaaaa taaggaaaag aaattgagca 16920 

actgactgac ttaaaggtca taaaactata tagtagcaga gtcagcaaaa gaagaaacac 16980 

acatctccca agtagaggct gaaaaccagt accattcacc tccagggtga gctatataca 17040 

gattacaaag tcaccttctc taaatgttca aactgaatcc catacccata ctttaccact 17100 

acctcgtaag aacagcctca gatcttgtta tagccttttt tttagcatgc tgaagccaat 17160 

aaaatgcttc ccactcagca agagaaacaa gttctgaaac actgaataat ctgcccaggg 17220 

cctatgaaca tctccactgt gagaaatgtt ctccactgtg tggagaagat ccttactctt 17280 

ctccacacag gcagaacatt agaaaaattc ttggattcta tgatgcacag cttaggagtc 17340 

tgtttagcac aatttaagtc caaatagtta ttaaatcctc ctctgttcca gaaacagtgc 17400 

taaatactgt gaatataaaa attgaaaaga tactctcctg gctcccaaga aagtcagcca 17460 

gatagaggag acacaggcac acaaatcact gtcacatgaa gctctacctc cctaacttca 17520 

aacgagggcc taagtcacca agaatacagt agcagttgtg actacgagta actactataa 17580 

ttcaatactt tatcttccct tagaaaactc ttctcccttg gaaatttatt tgcatttcta 17640 

aacaccactc cctaccaaaa ggaagcaggg ctccttgggg aaatagctga ttctaggtgt 17700 

ggactatgaa atgaaaatgg tgagtctggg acatcccatg ttgcccagaa atcaaggaac 17760 

tgcccaaaga ttaacagagt catgttaaat ggacctaaga gtgaaccaga aggagctcac 17820 

tttgccccgc gtggaacaat ttcaagaaaa acatgacagt aatgaattat aaaacatgaa 17880 

ttaaaataca tattggtact aaaaagagaa caaaaggatg tggctttgga taaagctctt 17940 

cttcatggaa gaataccagc taataaatgt aaaggaaatg agagaattag aaaaattatc 18000 

attttgtaaa ccttaatata ttcacctaga catgctaaaa ccactgagta aaaggctgct 18060 

tgggaagagg atgctcacat gatctcagag tttcacacca cagataattt attagataca 18120 

ggaaggaaga tgtgatcaag cttcctgtga cccccagcca ggccccacaa cactatgtgc 18180 

ctccttgtga tgtgggagct acacagcatc gcccacacag cttctcgcca aaactgcttg 18240 

aagctaatca caagggaaga actggacagc ttctgaccat gagacgctcc accagacaac 18300 

tcgcttggcc cctccaaaga aacttgcttg gcctctccaa agaaaactca gtttcattta 18360 

aaaacaaaac taattattta aaaacaaacg aaaagcaagt tgtggacttg agctccaggg 18420 

acagagcaga catacttttc cctgttcttc ccagtaagtg gtaataaaaa ccctcaacac 18480 

tagatataaa acaaatataa gaaggttctg gaaggggaag aggaggcaga ctatccaggt 18540 

gccttgaggc ccacagaaca acccagtgat gggttcactg ggtcttcttt ttgcttcatt 18600 

atctcagact tggagctgaa gcagcaggca acttcaaaac accaaggggc acagattgaa 18660 

aagccccaag aaaagcctgc cctctctagc caaaggacca ggaaggagac agtctaatga 18720 

gatggaacac atttagacag taactgccca tttaccagca ataactgagc agggagccta 18780 

gacttccagt cttgtgagga cgtaccaagg tacccaacac ccccaccaag gctgagtaag 18840 

gactgcgacc tttatccctg catggcagta gtaaggagcc catccctcac ccgccagcag 18900 

tgtcagggga acctggactt ccactcccac ccaggagtga tgaggccctc cctgctgggg 18960 

tcatgtcaga ggaggcctag tggagattca gtgacttaac cttttcccag agataatgag 19020 

gccacctttc ctccctcttc ccccatggtg acagtgaaag cactgtggca agcagtaggc 19080 

actcctaccc ctcctagcca gggaggtatc agggaggcca agtagggaac cagaataccc 19140 

acaaccaccc agcagcaaca ggggtccccc accccattgg gtgtcaatgg aagcagagcg 19200 

gaaagcctgg atatttaccc ccatctagaa gtaacaagct gatgtccccc ttcttctact 19260 

acaatggtgt tcaaaacagg tttaaataag gtctagagtc tgataacgta atacccaaat 19320 

cgttgaagtt ttcattgagg atcatttata ccaagagtca ggaagatccc aaactgaaag 19380 

agagaaaaga caattgacag acactagcac taagagagca cagatattag aactacctga 19440 

aaggatgtta aagcacatat cataagcctc aacaggctgg gcgcggtggc tcacgcctgt 19500 

aaccccagca ctttgggagg ccgaggcagg tggatcacaa gatcaggaga tcgagaccat 19560 

cctggctaac acggtgaaac cccgtctcta ctaaaaatac aaaaaaaaat agcaaggcat 19620 

ggtggtgggc acctgtagtc ccagctactc gggagcctga ggcaggagaa tggcatgaac 19680 

ctgggaagag gagcagtgag ccgagatcgc accaccgcac tccagcctgg gcaacagagc 19740 

aagacttcgt cccaaaaaaa aaaaaaaaaa aaaaaaaagc ctcaacaaac aactacaaac 19800 

gtgcttgaaa caaatgaaaa aaaaatcttg gcaaagaaat aaaagatata tattttggcc 19860 

aggtgcagtg gctcacagcc tgtaatccct gcactttggg aggctgaggc aggcggatca 19920 

cctgaggtca ggagtttgag accagcctga ccaacatgga gaaaccccgt ctctactaaa 19980 

aatacaaaat tagccagtca tggtggcaca tgcctgtaat cctagctact caggaggccg 20040 

aggcaggaga atcgcttgaa ctcaggaggt ggaggttgcg gtgagccgag atcccgccat 20100 

tgcacattgc actccagcct gggcaacaag agcaaaactc catctcaaaa aaatagatac 20160 

atattttaat ggaaatttta gaattgaaaa atacagtaac caaattgaat ggaaagacaa 20220 

catagaatgg agggggcaga caaaataatc agtgaacttc aacagaaaat aatagaaatt 20280 

acccaatatg aagaacagaa agaaaataga ctggccaaaa aataaagaag aaaaaagagg 20340 

agcagcagga ggaatgatgg aaaaagagaa aggaaggaag gaagggaagg agggagggaa 20400 

ggagtgaggg agaaagtctc aaagacctct gagactaaaa taaaagatct aacacttgtc 20460 

atcagggtcc aggaaagaga caaagatggc acagctggaa acgtattcaa aaaataatag 20520 

ctgaaaactt cccaaatttg gcaagagaca taaacctata gattcgaaat gctgaacccc 20580 

aaacaaaaag cccaataaaa tccacaccaa aatacatcat agtcaaactt ctgaaaagac 20640 

gaaaagagaa aacgccttga aagcagtgag tgaaacaaca cttcatgtat aagggaaaaa 20700 

caacccaagt aacagatttc ttacagaaat taaggaagcc agaaggaaat gacacaatgg 20760 

ttttcaagcg ctgaaagaaa agaagtgtca acacaaaatt ctagattcag taaaaatatc 20820 

cttcaagaat caatgggaaa tcaagacagt ctcagataaa gcaaaataag agaatatgtt 20880 

gccagcagat ctcccctaaa ggaatggcaa aaggaagatc atgcaacaga ccaaaaaatg 20940 

atgaaagaag gaatccagaa acatcaagaa gaaagaaata acatagtaag caaaaataca 21000 

tgtaattaca ataaaatttc tatctcctct taagacttct aaattatatt gatggttgaa 21060 

gcaaaaatta taaccctgtc tgaagtgctt ctactaaatg tatgcagaga attataaatg 21120 

gggaaagtat aggtttctat acctcattga agtggtaaaa tgacaacact gtgaaaagtt 21180 

acatacacac acacacgtaa gtatatataa atatatgtgt gtatatgtgt gtgtatatat 21240 

atatatacat ataatgtaat acagcaacca ctaacaacac tatacaaaga gataataacc 21300 

aaaaacaatt tagataaatt gaaatggaat tctaaaaaat attcaaatac tctacaggaa 21360 

gacaagacaa aaagagaaaa aaagaggagg acaaactaaa ttttttaaaa acataaataa 21420 

aatggtagac ttaagcccta acttatcaat aattacataa atgtaaatga tctaattata 21480 

ccaatcaaaa gacagagata gcagagttaa tttaaaaaca tagctataag aaacctgctt 21540 

tgggctgagt gcagtgactc acacttgtaa tcccagcact tcgggaggcc aaggcgggtg 21600 

gatcacctga ggtcaggagt tccagaccag cctggacaac atggtaatac cccatctcta 21660 

ctaaaaatac aaaaaaatta gccaggcatg gtggcacacg cctgtagtcc caactactca 21720 

ggaggctgcg acacaagaac tgcttgaacc cgggcagcag aggtagcagt gggccaagat 21780 

tgcgccactc cagcctgaac gacagagtga gactccacct cagttgaaaa acaaaaaaga 21840 

aacctgcttt aaatatacca acatatgttg gttgaaatta aaagaataaa atatatcatg 21900 

aaaacattaa tcaaaagaaa ggagtggcta tattaataac ataaaataga cttcagagaa 21960 

aagaaaattt caagagacag gaataaaagg atcaagaaaa gatcctgaaa gaaaagcagg 22020 

caaatcaatc attctgcttg gagattcaac accctctctt aacaactgat agaacaacta 22080 

gacaaaaaaa tcagcatgga gttgagaaga acttaacacc actgaacaac aggatctaat 22140 

agacatttac ggaacactct acccaacaat agcaaaataa acattctttt caagtattca 22200 

ctgaacatat ccttagaccc taccctgggc cataaaacaa agctcactag tgattgccga 22260 

aggcttggat ggacagtgga agagctgcat ggggagggag aaggtgacag ttaaagagtg 22320 

taggatttct ttttgggata atgaaaatgt tccaaaattg attgtggtga tgttggcgca 22380 

actctacaaa tataaaaaag gccattgaat tgtacgtttt aagtgggtga aacatatggt 22440 

atgtggatta tatctaacgc tttttaaaaa cttaacacat ttcaaagaat agaagtcata 22500 

cagagtgtgc tctactggaa tcaaactaga aagaggtaac tggaggataa cgagaaaagc 22560 

ctccaaatac ttgaaaactg gacagcacat ttctaaaatc atccgtgggt caaagatatt 22620 

catttctgat attcattttt attgtttaat gtatttttaa aaatttctta agggaaataa 22680 

actgactaaa aatgaatatg gctgggtgcg gtggctcacg cctgtgatcc cagcactttg 22740 

ggaggccgag gctggtggat cacaagatca ggagttcgag accagcctgg ccaagatggt 22800 

gaaaccccgt ctcaactaaa aaactacaaa aagtagccaa gcgcagtggc gggagcctgt 22860 

ggtcccagct acttgggagg ctgaggtagg agaatcgctt gaacacaggc agcagaggtt 22920 

gcagtgagcc aagattgtgc cactgcacgc cagcctgggc gacagagact gcctcaaaaa 22980 

aaaaaaaaaa aaaaagaata tcaaaatttg tgggacatag ttaaagcaat gctgagaggg 23040 

aaatttataa cactaaatgt ttacattaga aaagagaaaa agtttcaaat caatagtctc 23100 

cactcccatc tcaagaacac agaagatgaa gagcaaaata aacccaaagc aagcaaaaga 23160 

aagaaaatat aaaaataaat cagtaaaatt gaaaacagaa acacaataaa gaaaatcagt 23220 

gaaacaaagt actgattctt cgaaagatta ataaaattga caaacctcta gcaaggctaa 23280 

caaacaaaaa agaaagaaga cacggattac cagttattag aatgaaagca taattagaaa 23340 

caactctaca cattataaat ttgacaatgt agatgaaatg gactaattac tgaaaaaaca 23400 

caaattacca caactcaccc aatatgaaat agataattgg gatagcctga taactactga 23460 

gaaaattgaa tttgtaattt taacactctt aaaacagaaa cattaaactt aatattttat 23520 

aaatattaga taaggtaatt atacccttcc ttaacaaata aaaacgacaa attattttgc 23580 

agctaaagag atgtatgtac tgtgaaaaat atcttcagaa aaatagaact ttgtttgaag 23640 

aataaggatt taaaaaatgt ttttaactct caagaagcaa atatctgggc ccagatggtt 23700 

tcactgaaga attctaccaa atgtttaatg aagaattacc accaactcta catagcatct 23760 

ttgagaaaac tgaagagaag ggaacatctc ccagttcatt ttatgaagtg ggtgttactc 23820 

tgatactaga actgtataag gacagctact cttgacacac tgcctatggg tagctctgct 23880 

ctgcaggaac agtcagaaaa aaaaaaaaaa gaagcactgg acaagggcag tataaaaaaa 23940 

gaaaactggg ccaggtgcag tggctcacac ctgtaatctc agcactttgg gaggctgacg 24000 

ctggtggatc acctgaggtc aggagtttga gactagcctg gccaacatgg taaaaccctg 24060 

tctctactaa aatacaaaaa ttagccaggc agggtggtgg ggaaaataaa aaggaaaaaa 24120 

aaacaaaaat aaactgcaga ccaatatcct tcatgagtat agacacaaaa ctccttaaac 24180 

tccttaacaa aatattagca agtagaagca atatataaaa ataattatac accatgatca 24240 

agtgggactt attccagaaa cgcaagtctg gttcaacatt tgaaaacaag gtaacccact 24300 

atatgaacgt actaaagagg aaaactacat aatcacatca atcaatgcag aaaaaagcat 24360 

ttgccaaaat ccaatatcca ttcatgatac tctaataaga aaaataagaa taaaggggaa 24420 

attccttgac ttgataaagc ttacaaaaga ctacaaaagc ttacagctaa cctatactta 24480 

atggtgaaaa actaaatgct ttcccctacg atcaggaaca aagcaaggat gttcactctc 24540 

attgctctta tttaacatag ccctgaagtt ctaacttgtg caaaacgata agaaagggaa 24600 

atgaaagacc tgcagattgg caaagaagaa ataaaactgt tcctgtttgc agatgacatg 24660 

attgtctcat agaaaatgta aagcaactag gggtaggggg gcagtggaga cacgctggtc 24720 

aaaggatacc aaatttcagt taggaggagt aagttcaaga tacctattgc acaacatggt 24780 

aactatactt aatatattgt attcttgaaa atactaaaag agtgggtgtt aagcgttctc 24840 

accacaaaaa tgataactat gtgaagtaat gcatacgtta attagcacaa cgtatattac 24900 

tccaaaacat catgttgtac atgataaata cacacaattt tatctgtcag tttaaaaaca 24960 

catgattttg gccaggcaca gtggctcata cctgtaatcc cagcatttta ggaggctgag 25020 

gcgagcagaa aacttgaggt cgggagtttg agaccagaat ggtcaacata gtgaaatccc 25080 

gtctccacta ataatacaaa aattagcagg atgtggtggc gtgcacctgt agacccagct 25140 

acttgggagg ctgaggcacg agaattgctt gaacaaggga ggcagaggtt gcagtgagct 25200 

gggtgccact gcattccagc ctggtgacag agtgagactc catctcaaaa aaaataaaat 25260 

aaagcatgac ttttcttaaa tgcaaagcag ccaagcgcag tggctcatgc ctgtaatccc 25320 

accactttgg gaggccgagg caggcagatc acaaggtcag gagtttgaga ccagcctgac 25380 

caacatggtg aaaccccatc tctactaaaa aatatataaa ttagccaggc atgtgtagtc 25440 

tcagctactc aggaggctga ggcaggagaa tcacttgaac ccggaggcag aggttgcagt 25500 

gttgagccac cgcactccag cctgggtgag agaacgagac tccgtctcaa aaaaaaaaag 25560 

caaaataacc taattttaaa aacactaaaa ctactaagtg aattcagtaa gtctttagga 25620 

ttcaggatat atgatgaaca tacaaaaatc aattgagctg gacaaaggag gattgtttta 25680 

ggtcagtagt ttgaggctgt aatgcacaat gattgtgcct gtgaatagct gctgtgctcc 25740 

agcctgagca gcataatgag accacatctc tatttaaaaa aaaaaaaatt gtatctctat 25800 

gtactagcaa taagcacatg ggtactaaaa ttaaaaacat aataaatact gtttttaatt 25860 

gcctgaaaaa aatgaaatac ttacatataa atctaacaaa atgtgcagga cttgtgtgct 25920 

gaaaactaca aaacgctgat aaaagaaatc aaagaagact taaatagcgt gaaatatacc 25980 

atgcttatag gttggaaaac ttaatatagt aaagatgcca attttatcca aattattaca 26040 

caggataaca ttattactac caaaatccca gaaaaatttt acatagatat agacaagatc 26100 

atacaaaaat gtatacggaa atatgcaaag gaactagagt agctaaaaca aatttgaaaa 26160 

agaaaaataa agtgggaaga atcagtctat ccagtttcaa gacttacata gctacagtaa 26220 

tcaagactgt gatattgaca gagggacagc tatagatcaa tgcaaccaaa tagagaacta 26280 

agaaagaagc acacacaaat atgcccaaat gatttctgac aaaggtgtta aaacacttca 26340 

acgggggaag atatgtctct cattaaaggg tgtagagtca ttgcacatct ataggcaaaa 26400 

agatgaacct gaacctcaca ccctacagaa aaattaactc aaaatgactc aaggactaaa 26460 

cataagatat acatctataa aacatttaga aaaaggccac gcacggtggc tcacgctcgt 26520 

aatcccagca ctttgggagg ccaaggcagg tggatcacct aaggtcagga gtttgagacc 26580 

agccggatca acatggagaa gccccatctc tactaaaaat acaaaattag ctggacgtgg 26640 

tggcacatgc ctgtaatccc agctacttgg gaggctgagg catgagaatc gcttgaaccc 26700 

ggggggcaga ggttgcggtg agccaagatc acaccattgc actccagcct gggcaacaag 26760 

agcaaaactc caactcaaaa aaaaaaaaaa aaaggaaaaa tagaaaatct ttgggatgta 26820 

aggcgaggta aagaattctt acacttgatg ccaaactaag atctataagg ccagtcgtgg 26880 

tggctcatgc ctgtaattcc agcactttgg tcaactagat gaaaggtata tgggaattca 26940 

ctgtattatt ctttcaactt ttctgtaggt ttgacatttt tttagtaaaa aattggggga 27000 

aagacctgac gcagtggctc acacctgtaa tcccagcact ttgggaggcc ggggcaggtg 27060 

gatcacacgg tcaggagttc gagaccagcc tggccaacat ggtgaaaccc cgtctctacc 27120 

aaaaatataa aaaattagcc gggtgtcatg gtgcatgcct gtaatcccag ctactgagga 27180 

ggctgaggca ggagaatcac ttgaacctgg gaggtggaag ttgcagtgag ccgagattgt 27240 

gccactgcac tccagccttg ggtgacagag cgagactccg tctcaaaaga aaaaaaaaaa 27300 

aaagaatatc aaacgcttac tttagaaact atttaaagga gccagaattt aattgtatta 27360 

gtatttagag caatttttat gctccatggc attgttaaat agagcaacca gctaacaatt 27420 

agtggagttc aacagctgtt aaatttgcta actgtttagg aagagagccc tatcaatatc 27480 

actgtcattt gaggctgaca ataagcacac ccaaagctgt acctccttga ggagcaacat 27540 

aaggggttta accctgttag ggtgttaatg gtttggatat ggtttgtttg gccccaccga 27600 

gtctcatgtt gaaatttgtt ccccagtact ggaggcgggg ccttattgga aggtgtctga 27660 

gtcatggggg tggcatatcc ctcctgaatg gtttggtgcc attcttgcag gaatgagtga 27720 

gttcttactc ctagttccca caacaactgg ttattaaaaa cagcctggca ctttccccca 27780 

tctctcgctt cctctctcac catgtgatct cactggttcc ccttcccttt atgcaatgag 27840 

tggaagcagc ctgaagccct cgccagaagc agatagtgat gccatgcttc ttgtacagcc 27900 

tacaaaacca tgagcccaat aaaccttttt tctttataaa ttatccagcc tcaggtattc 27960 

ctttatagca agacaaatga accaagacag ggggaaatca acttcattaa aataatctat 28020 

gcagtcacta aacaaataag aacaagaggc tccagaagtg ggaagccaat acccagagtt 28080 

cctacaatac agtatctgaa aagtccagtt tccaaccaaa aaatatatat atacaggccg 28140 

gacatggtag cttatgtctg taatcccagc actttgggat gctgaggcgg gcagatcacc 28200 

ctaggtcagg agttcgagac cagcctggcc aatatggcaa aaccccgtct ctactaaaaa 28260 

tacaaaaatt agccaggcat ggtggtggat gcctgtaatc ccagctactc gggaggctga 28320 

ggcagggaat cacttgaacc caggaggcag aggttgcagt gagccgagat cacgccactg 28380 

aactccagcc tgggcaacaa agtgagactc cacctcaaaa aaaaaaaaaa tatacatata 28440 

tatatgtgtg tgtgtgtgtg tgcgcgcgtg tgtgtatata cacatacaca tatatacata 28500 

tatacagaca cacatatata tatgaagcat gaaaagaaac aaggaagtat gaaccatact 28560 

ttctgtggtt atgataggat ggggtatcac gggggaagta gacaagggaa actgcaagtg 28620 

agagcaaaca gttatcagat ttaacagaaa aagactttgg agtaaccatt ataaatatgt 28680 

ccacagaatt aaagaaaagc gtgattaaaa aaggaaagga aagtatcata acaatattac 28740 

tccaaataga gaatatcaat aaaggcatag aaattataaa atataataca atggaaattc 28800 

cggagttgaa aggtagaata actaaaattt aaaattcact agagaaggtt caacactata 28860 

tttgaactgg cagaagaaaa atttagtgag acaaatatac ttcaatagac attattcaaa 28920 

tgaaaaataa aaagaaaaaa gaatgaagaa aaataaacag aatctcagca aaatgtggca 28980 

caccattaat cacattaaca tatgcatact gagagtaccg gaagcagatg agaaagagga 29040 

agaaaaaata ttcaaatgat ggccagtaac ttcctagatt tttgttttaa agcaataacc 29100 

tatacaatca agaaactcaa tgaattccaa gtaggataaa tacaaaaaga accacaaaca 29160 

gatacaccat ggtaaaaatg ctgtaagtca aaaacagaga aaatattgaa agcagctaga 29220 

ggaaaactta taagagaacc tcacttacaa aagaacatca cttataaaag aaccacaata 29280 

atagaaacag ttgacctctc atcagaaaca atgaatgata acatatttga agtgctcaaa 29340 

gaaaaaaaat aaagattcct atatacgaca aagctgtctt tcaaaaatat acatccaaaa 29400 

ggattgaaac cagggtcttg aagagttatt tgtacatcca tgttcatagc agcattattc 29460 

acaatagcca aaaggtagaa gcaacccaag ggtccatcga caaataaata aaatgtggta 29520 

tatgtataca caatggaatt tattcagtat taaaaaggaa tgaaattctg acacatgcta 29580 

caacatggct aaaccttgag aacactatgc taagtgaaat aagccagcca caaaaggaca 29640 

aataccatat tacttcactt gtatgaaata cctagggtag tcaaattcag agatagaaag 29700 

taaaacagtg gttgccaagg gctgagggag ggagtaacgt ggagttattg ttgaatgggt 29760 

acagaatttc agttttgcaa gataaaaaga gttctggaga cagatggtgg tgagggtggt 29820 

acaacaatac aaatatactt tatactactg aacagtatac ttaaaaatga ttaacatggt 29880 

gaaaccccgt ctctactaaa aatacaaaaa aattagctgg gtgtggtggc gggcacctgt 29940 

aatcccagct acttgggagg ctgaggcagc agaattgctt gaaaccagaa ggcggaggtt 30000 

gcagtgagct gagattgcgc caccgcactc tagcctgggc aataagagca aaactccgtc 30060 

tcaaaaaata aaaaataaaa aaaatttaaa aatgattaag caggaggcca ggcacggtgg 30120 

ctcacaccta taatgccagc actttgggag gccgaggcag gcgatcactt gagaccagga 30180 

gtttgagacc agcctggcca acatggcaaa accctgtctc tgctaaaaat acaaaaatta 30240 

gccaggcatg gtggcatata cttataatcc cagctactgg tgagactgag acacgagaat 30300 

tgcttgaacc caggaggcag agattgcagt gagtcgagat cgcgccactg aattccagcc 30360 

tgggcgacag agcaagattc tgtctcgaaa aaacaaaaac aaaaacaaaa agcaaaacca 30420 

aaaaataatt aagcaggaaa cgagattgct gctgaggagg agaaagatgt gcaggaccaa 30480 

ggctcatgag agcacaaaac ttttcaaaaa atgtttaatg attaaaatgg taaattttat 30540 

atgtatctta ccacaaaaaa aagggctggg gggcaggaaa tgaaggtgaa ataaagacat 30600 

cccagagaaa caaaagtaga gaatttgttg ccttagaaga aacaccacag gaagttcttc 30660 

aggctgaaaa caagtgaccc cagagggtaa tctgaattct cacagaaaat tgaagcatag 30720 

cagtaaaggt tattctgtaa ctatgacact aacaatgcat attttttcct ttcttctctg 30780 

aaatgattta aaaagcaatt gcataaaata ttatatataa agcctattgt tgaacctata 30840 

acatatatag aaatatactt gtaatatatt tgcaaataac tgcacaaaag agagttggaa 30900 

caaagctgtt actaggctaa agaaattact acagatagta aagtaatata acagggaact 30960 

taaaaataaa attttaaaaa atttaaaaat aataattaca acaataatat ggttgggttt 31020 

gtaatattaa tagacataat acaaaaatac cacaaaaagg gaagaagaca atagaactac 31080 

ataggaataa cattttggta tctaactaga attaaattat aaatatgaag tatattctgg 31140 
taagttaaga cacacatgtt aaaccctaga tactaaaaag taactcacat aaatacagta 31200 
aaaaaataaa taaaataatt aaaatgtttg tatcagtttc ctcagggtac agtaacaaac 31260 
taccacaaat tgagtggctt aacacaactt aaatgtattt tctcccagtt ctggaggcta 31320 
aacacctgca atcaaggtga gtacagggcc atgctccctg tgaaggctct aggaaagaat 31380 
cctcccttgt ctcttccagc ttccagtggt tctcagtaac cctaagtgct ccttggcttg 31440 
tagctatatc attcctagca accagaaaga agaaaataat aaagafctatg gcaaaaaata 31500 
atgaaatcaa aaggagaaaa atggaaaaaa ataaataaaa ccaaaagcta gttctttgaa 31560 
aagatcaacc aagttaacaa accttttaac tagactgaca aaaaggaggt aagactcaaa 31620 
ttactagaat cagaaataaa agaggggaca ttactaatga gggattagaa aagaatacta 31680 
cgaacaaatg tgtgccaaca aattagaaaa cctagatgaa atggacaggt tcctaggaca 31740 
acatcaacta ccaaaattta ctcaagaaga aagagacaat ttgaatgagc tataacaagg 31800 
gaagagactg aattgacaac caagaaacta tccacaaaga aaatcccagg cccagaagat 31860 
ttcactgtga aattctttca aacttataaa tataaattaa catcagttct tcacaaactc 31920 
ctccaaaaaa aagaacagat ctctatttac aggcgatacg atctttagaa aatcctaagg 31980 

gaactactaa gacactatga taactgataa acaagttcag caaggctgca ggatagaaaa 32040 

ccaatataca aaaatctatt atatttctat acacttgcag tgaacaaccc aaaaatgaga 32100 

ctaagaaaat aattcaattt acaataacat caaaaagaat aaaaacactc aaaaataaat 32160 

ttattcaagt aagtgcaaaa cttatactct agaagctaca aaacactgtt aaaagaaatt 32220 

aaaggtttac ataaatgaaa aactatccca tgttcatgga tcaaaagact tattactggc 32280 

aatgctctcc aaattgatct ataaattcaa caaaatcctt atcaaaatcc cagatgaggc 32340 

tgggggtggc ggttcatgcc tgtaatccca gcactttggg aggctgaggc acgcagatta 32400 

cctgaggtcg ggagctcgag atcagcctga ccaacatgga gaaaccctat ctcttctaaa 32460 

aatacaaaat tagtcaggcg tggtggcaca tgcctataat cccagctact cgggaagctg 32520 

aggcaggaga atcgcttgaa cccaggaggc agaggttgca gtgagccaag atcgtgccat 32580 

tgcaccccag cccgggcaac aagagcaaaa ttccacctca aaaaaaaaaa aaaaaaaatc 32640 

ccagatgact tcactgttga aattgaaaag attattctaa aattcacatg gaattgcaag 32700 

accttgagaa tagccaaaac aaacttgaaa aacacgaaca aaatatagga tgactcactt 32760 

gccaattgca aatgttacga cacagcaaca gtaatcaaga ctgtgtggta ctggcaaaag 32820 

acacatacat acatacatat caatggaata taattgagag tacagaaaca agcctaaaca 32880 

tctatggtaa gtgcttttct atttttttct tttttttttt cttttttgta gagatagaat 32940 

ctcaccatgt tgcccaggct ggtcttcaac ttctgggctc aagcaatcct cccactgtgg 33000 

cctcccaaag tgctgggata actggcatga gccaccacat ccagcccaga tgattttcaa 33060 

aaaagtcaac aagaccattc ttttcaacaa ataggtctgg gatgatcaga tagtcacatg 33120 

aaaaaaaaaa tgaagttgga ccctccatca cactaaagtg ctgcgattat aggcatcagc 33180 

caccacatcc agcccaaatg attttcaaaa aggtcaacaa gaccattctt ttcaacaaat 33240 

aggtctggga taatcagata gtcacatgaa aaaaaaaatg aagttggacc ctccatcaca 33300 

ccatatgcaa aaactaattc aaaaatgaat tgatgactta aacgtaagag ttacgactgt 33360 

aaaactctta gaaggaaaca tacgggtaaa tcttaaagac gttaggtttg acaaagaatt 33420 

cttagacatg acaccaaaag catgaccaac taaggtaaaa tagggtaaat tgtacctacc 33480 

aaaatgaaaa acctttgtgc tggaaaggac accatcaaga aatggaaagc caaaatagcc 33540 

aaggcaatat taagcaaaaa gaacaaagct ggaggcatca tactacctga cttcaaagca 33600 

acagtaacca aaacagcatg gtactagtag aaaaacagac acatagacca atggaacaga 33660 

ataaagaacc caaaaataaa tccacatatt tatagtcaac tgatttttga caatgacacc 33720 

ccttcaataa atgatactag gaaaactgga tatcgatatg cagaagaata aaactagacc 33780 

cctatctctc accatataga aaaatcaact cagactgaat taaagacttg aatgtaagac 33840 

ccaaaactat aaaactactg gtagaaaaca taaggaaaaa cgcttcagga cattggtcca 33900 

ggcaaagatc ttatggctaa aacctcaaaa acacaggcaa caaaaacaaa aatggaaaaa 33960 

tagcacttta ttaaactaaa aagctcctgc acagcaaagg aaacaacaga atgaaaagac 34020 

aacctgtaga atgggagaaa atatttgcaa actatccatc catcaaggga ctagtatcca 34080 

gaacacacaa gtgactaaaa caactcaaca gcaaaaaagc aaataatctg gtttttatat 34140 

gggcaaaaga tctgaataaa cattctcaaa ggaagacata caaatgtcac tatcattctg 34200 

ccagtaccac actgtcttga ttacttgtta gtgtataaat ttttaaattg ggaagtgtga 34260 

gtcatcctac actttgttct tgtttttcaa gtttgttttg gctattctgg gagccttgca 34320 

agtataaaat agccaacaag tatgaaaaaa tgctcaccat cactaatcat cagagaaata 34380 

aaaatcaaga ccactatgag atatcctctc actccagtta gaatggctac tatcaaaaag 34440 

acaaaatata atggatgctg gcaaagattt ggagaaaggg gaactcctat acactgtggg 34500 

tagggatgca aattggtaat ggccattatg gaaaataata ctgaggtttt tcaaaaaact 34560 

gaaaatagaa ctaccatatg atccagcaac cctactactg ggtatttatc caaaggaaag 34620 

aagtcagtat actgaagaaa tatatgcact ctcatgttaa ttgcaacact gttcacaaca 34680 
gccaagacag 

catatacact 

gcaacatgga 

aaacagtaca 

gaaggggagg 

tcctagtgtt 

caaaaagcta 

gatggatatc 

cactctgtac 

aattaagaca 

aatatttaat 

cccaattcaa 

ggccaataaa 

aaccacaatg 

taataacaaa 

atgtaaagtg 

agagttaccg 

tacatccaca 

gtggtaagaa 

atacaccaga 

gaatctcgga 

ttccatgcat 

gttgggtggg 

accacaaact 

ctctgagatc 

tcccaggcct 

ttttctctgt 

gctatgcaca 

aacttatagt 

tttttatcta 

tagcaggaat 

gggatggtga 

aagctggaag 

ggctggagct 

ctgacttctc 

tggccagaga 

gagacagaga 

gtactatctt 

aattgagaga 

gagaagagaa 

gaatgaaaag 

cacgctgtac 

agtcaaattc 

attaaataaa 

tagcaaacga 

taatattctg 

ttggtctata 

ccaaccccca 

ctgagatgct 

taccaccatc 

agagtctttc 

ctctcacccc 

ccctcttccc 

gctcccaacc 

agtaattgcc 

ttgtgtcttg 

ttcattagca 

ttaagtctta 

cagaccaaat 

ggagcagagc 



ggaataaatc 

caatagaata 

tgaacctgga 

tgttctcact 

cttgggaaaa 

ctatagaact 

gaagagattt 

ctaattaccc 

cccataaata 

acccacataa 

atttataata 

aaatgggtaa 

gacacgaaaa 

tagaatgtag 

tgttggtaag 

atgcagccac 

tatgacccag 

taaaaacttg 

cccatatgcc 

atattatctg 

aaccttatgc 

cggaaatgac 

gctgggagga 

ggggagctta 

aaggtgtcag 

ctctccttgg 

gtgtgcccat 

aagtgaagtc 

cattttaatg 

cattgtgcaa 

ttaatatggg 

gccaacccag 

gataaaggag 

gggaccaccc 

ccttcctccc 

cagtgacaag 

aatatggaag 

atttatcttt 

aactgaaaac 

aaatttattc 

ctataatcag 

aacctgaagg 

taagtgcttt 

atggaaactt 

ttctggaatt 

ctgacctcct 

gtttacatct 

gtatccatat 

ttccatgttt 

aaccttggat 

ttgtcattcc 

atggaatttg 

ccttcattta 

cctgctgccc 

tgctccctca 

tccatcacta 

aaatgttatg 

agactatggt 

gaagagacca 

taagtagttc 



taaatgtgca 

ctattcagcc 

ggacattata 

cagacatggg 

gttaacggat 

gtagggcgag 

tggacgttcc 

tgattcaatc 

tgtataatta 

tggaagaaat 

tataaagaac 

aagccttgaa 

gatgctcaac 

acaccacttc 

gatgtgaaaa 

tt tggaaaac 

gaatattcct 

tacatgggca 

catcatctga 

cccatacaag 

taagtgaaag 

cagaataggg 

caggtagtac 

aacatagaaa 

cagagctggt 

ctggcaggtg 

gtccaaattt 

tacttccaaa 

tccgcttttc 

agtttaataa 

aactaattac 

agat tagcaa 

gggctattat 

tagagacact 

acctttcaat 

gaacactgca 

ggtagaaaat 

gtatctccag 

tccaattgaa 

cgcatagagt 

caaagatttg 

cacaatgcat 

tccagaatct 

actaaacttt 

cctagagtaa 

tttgctattt 

acgggcttat 

actgctctct 

ttttttttta 

tatttaagca 

tgctatcagc 

cagatgaagt 

gacatcacct 

aattgtgtgc 

tctgtctccc 

taatctcagc 

tataaccttg 

ttagaacatg 

tgttcattta 

caagggaaca 



ccaacagatg 
attaaagaag 
tttaacgaaa 
tgctaaaaag 
aaaaatttac 
tatagttacc 
cagcacaaag 
attacacatt 
ttacgtcaac 
aaaatatctg 
tcctacaact 
tatacactta 
atcactagtc 
atatgcacta 
aatcagaaac 
agtctggcag 
cctgggtcta 
tttatagcaa 
tgaacaggta 
gagtgacatc 
aagccagtca 
aaatctatag 
actactttcc 
ttgatttcct 
tctttctgag 
gccatcttct 
tgattggctc 
agaagggaag 
ctatgagatt 
gaaaaatagr. 
aaggtttagg 
cagtgggacc 
cagagtccac 
gtgcaaagca 
ctcccactag 
aaatgaagtt 
gaatcagagg 
tgcctaatct 
atgaaagaat 
aaacaagaat 
ccagagaaat 
gaaaacgttt 
ctcaagacga 
ccccttgtat 
aatatatttc 
aggatatttg 
actgttcttt 
atcagggtta 
ttttctgcca 
ttcacgattc 
acagaaccca 
tcaaaaggac 
tcttctagaa 
tctcccgtgt 
cacccagaca 
acctagtacc 
caccttaaaa 
gatcagaaac 
catacaacct 
cacggccctg 



aatggataaa 

aatgaaatcc 

caagtaaagc 

aaaatggggt 

agctatgtaa 

aataacttat 

gaatgataaa 

gcatacatgt 

aaaaaaagga 

caaattatat 

caagaacaac 

tctaaagact 

atcagggaaa 

ggatggctag 

ctcattcgct 

ctcctcaaat 

taaccaaaaa 

cattattcat 

aataacatgc 

cagctacatg 

caaatgacca 

agacagaaag 

cagaactact 

cacagttctg 

ggccctgagg 

ccctgcgtct 

attctgggtc 

agggaacact 

gtgaacacac 

attcaagaga 

gcaggactaa 

ccatctacct 

aagccagtgt 

gaaaacaagg 

tgcttcctac 

tgtaggaatc 

ataaagagaa 

gtctctcaaa 

ggagaattac 

ggattcacaa 

taaaaagtgg 

caagaaacga 

Ctatatagct 

taaactaaca 

gtcaaagtgt 

tatacacatc 

ttttcatttt 

ttttaacttt 

catttgaata 

cacgtgtgga 

atctcagctt 

ctttgcatta 

cgtcttacct 

cctggcctgc 

ttaagctgaa 

tagtaggtac 

acaagagaag 

tacagtctgc 

atagcagctt 

caaagcctaa 



gaaaatgtgg 
tgtcatccca 
acaaaaagac 
cacagaatta 
gaagaataag 
tgcacatgtt 
tgtctgtgat 
atcaaattat 
aaaaaaagaa 
atatctgata 
aacaaaacaa 
atatacaatt 
tataaatcaa 
aataaaaagg 
gctgttggga 
tattaaatac 
aatgaaaaca 
aacagcaaag 
ggtattatcc 
ctacaaggat 
cagattatga 
tagattagtg 
ggaacaaagt 
gagactagga 
caaggctctg 
tcacatcatc 
atggccaatt 
gactaggcta 
agaagtaggg 
agcagttcaa 
aaagccagtt 
accacccatg 
cagagtcctt 
gggaaaaacc 
tagccatact 
atctccctct 
aaaaccctga 
aaaggaaagc 
tggactagaa 
aggacgtgat 
taaactcagc 
caagatttga 
accccatttt 
tatgtcctaa 
attgctcttt 
acacgtaaat 
tttaaaattt 
gtaaaatcag 
gcataggagt 
ttttttattc 
tccagctata 
tcctgcctcg 
gacatgccct 
catcctcttt 
tagactggat 
ttaccatgta 
gaagacaaaa 
agcccaaacc 
tcacactaca 
aatatttact 



34740 

34800 

34860 

34920 

34980 

35040 

35100 

35160 

35220 

35280 

35340 

35400 

35460 

35520 

35580 

35640 

35700 

35760 

35820 

35880 

35940 

36000 

36060 

36120 

36180 

36240 

36300 

36360 

36420 

36480 

36540 

36600 

36660 

36720 

36780 

36840 

36900 

36960 

37020 

37080 

37140 

37200 

37260 

37320 

37380 

37440 

37500 

37560 

37620 

37680 

37740 

37800 

37860 

37920 

37980 

38040 

38100 

38160 

38220 

38280 
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ccatagctct 
acttcactaa
cgatccagaa
gtcagtattc
tactgtgttc 
catcatcacc 
taaagtcaaa 
gcagcccagg 
acacttttca 
tactaaatgt 
tgactattca
gaeytat aac 
tctaaaatag
ttatgtcaat
ajz aa iraga 
ct atajt ccc 
ccatcctggc
gt arjt a=it gy 

ct ga-jt 'Maa 
gt aa.it .taat 
tgcaaa r^tc 
tttai ri-jgc 
cagt g.*.*acc 
tattatatat
gtgeanrcac 
cacgaccagg 
aggag.i ;atg 
tctacaggay 
agagat i^gt 
tcatgtaaat
ggtttgaacc
acagcaagac
aaggatgaag
ttccttatga
catcacacgc 
cagtaggctg 
gcaggcaatc 
gtattatatg 
aaacggaatg 
catgggccag 
ctgaggtcta 
gcactttggg 
aacacagtga 
gcctgtagtt 
ggttgcaggg 
gtctcaaaaa 
actagataca 
tgaatgaaat 
gaactcagac 
cagaataaaa 
cttccttcta 
tgcagggaga 
tcttgtggaa 
aagccagtga 
gggaagaaag 
gaagtggcag 
gaccatgtgt 
caatctcaac 
agttcaaaag 



tcacagaaaa 
accatagttt 
aaggttgaaa 
tgctgccatg 
tcaatgccga
atgcttgttt 
tgacactagt 
ccctagcaac 
ccctttcaag 
tgtcaacaga 
ggttatagaa 
atattaggag 
tctatattgg 
ggaaactcaa 
cagatgcaaa
agcactttgg 
taacatggtg 
gcaccagtag 
cggagattgc
ctccatctca 
aaaaagagag 
agggccttgg
caaggcgggc
cggtctctac 
atatatatca 
ccttgacagc 
cagtcctact 
tacgaggctc 
acgagatgga 
ggtgagtgac 
taaaaataga 
gcctgtgtcc 
caacccctct 
acttttatga
ttttctttat 
aaaataaatg 
tcagtagtta 
agttcccctg 
aacctcatta 
aacaaataaa 
tgatgataaa 
aaaccaagga 
aggctgaggc 
aagcccatct 
agctactctg 
agccgagatc
aataaaaaaa
gcctttagag
tgaaaagcct
aactcaaaca
atcagctgca
gtggttcttt 
catggggtat 
gattatacac 
caaagaagcc
accaacatgg 
atctctgagc 
ggatttttta 
tttccagcta 
gatccttgcc 



agttttcaga 
tttgggtttg 
agaatgaatc 
ctgacaccca 
gtccacccac 
atccttaagg 
ggccaggagg
agcaggagct 
agagactagg 
catgtcaaaa 
ttaaggattc 
aaactatgtg 
attccagttg 
aaagataaca 
taaaaagagg 
gaggccgagg
aaaccccgtc 
tcccagctac 
agtgagccga 
aaaaatataa 
agactgctaa
gatggccggg 
ggatcatgag 
taaaagtaca 
gagccttggg 
aatctggcag 
cctgggtcta 
attcagcatt 
caaaatgtgg 
aatcctaaga 
tgcacacaaa 
acttacatgt 
tcttcctcct 
taatccaatt 
ctctagctta 
ttaattgact 
agttttggga 
accccctcat 
gaatagctgt 
ccaacaaatg 
gggctaagaa 
aagggagggc 
gggcggatca
ctacaaaaaa 
gaggctgagg 
acaccattgc 
ataaaaaaac 
ttagaaaaga 
ttcaaactaa 
ggtaatgtca 
tgtgaagcag 
ccgaaaacat 
ataactatga 
aatgaggcaa 
agtgatgaaa 
atgggggtga 
tggatgatgg 
ttcagctctt 
tattgagcta 
ttttcaaaat 



tccctcgttt 
tttggttttt 
attactgctg
tccaatagtg 
tccataacca 
tattgcctca
tcaagagaat 
cacccctcag 
aatctggatt 
ggtaaaacta 
ttatccaaca 
cactgtcgaa 
aaacatgggg 
agcatatata 
gaaactgctg 
cgggcggatc
tctactgaaa 
tcaggaggtt 
gaccatgcca 
taataattat 
agtctagaaa 
tgcagtggct 
gtcaagagat 
aaaaaatata 
aatccttgtg 
tacttggtta 
aatcccaaag 
actgggagtg 
tggatattaa 
tacagaataa 
gcagtatacg 
ggattttctt 
ccccctcagc 
ccaaggaact 
cattattcta 
gtttatatta 
gtcaaaagtt 
tgttcacggg 
ctatagggag 
cattaacaag 
tgagaatata 
caggcgtgga 
caagattagg 
tacaagaatt 
caggagaatc 
actccagcct 
agagaaaggg 
tgatttgaca 
aacatttaat 
gcgtggtgtt 
tgactagaat 
taataggcac 
cttactgttc 
caaaaactat 
ggccctgtga 
tcagggtggc 
gccactacca 
tcgtgtcatt 
aacttctcac 
aattttgaat 



agaactcttg 
tttggcaaaa 
aaagaatgtg 
tcatgagatg 
tgtccaagca 
catacagcag 
gagtgaggac 
tcactctagc 
tttatgtgaa 
agtaagttca 
cagataccaa 
acatcaacaa 
aaaggacatg 
aaagcattct 
ccgggcacag 
atgaagtcag 
acacaaaaaa 
gaggcaggag 
ctgcactcca 
aattataata 
gctgaatgat 
cacgcctgta 
caagaccatc 
tatatatata 
tgctgctggg 
tattaagtat 
aattctcaca 
ggaatcaacc 
gaccagaatc 
aggctagaac 
cgtgaccctt 
ccacttctgc 
ctactcaaca 
aatgaaaagt 
agaatatggt 
tgggtaaggc 
atacacagat 
tcaactgtat 
aagagaatga 
caaaacaaca 
attaattcaa 
ggctcacgcc 
agtttgagat 
acccaggtgt 
acttgaaccc 
gggtgacaga 
aggaaactag 
atctaagccc 
tacaccatct 
ttatatcacc 
gaagaaaagg 
cagctctatg 
attcctcaag 
ccaataaaac 
gcagagctga 
tccgtgggaa 
tctgtatatg 
cctgctatca 
ctcatggaat 
ggttgagtag 



ttcatatgca
aggaatgagc 
cacacagtcc 
cagcagctac 
atcttgggaa 
tggctggtca 
aggtgggtag 
caggactgaa 
atatcttgat 
tggggcagat 
ccaaaaagct 
ggggctaatg 
aacaggcaac 
caaattcagt 
tggctcacac 
gagatcgaga 
ttagccaggc
aatggcatga
gcctgggcga
ataataaata 
gccaagcgca 
atcccaccac 
ctggccgaca 
tatattatta 
gaaggtagtg 
aggcacacac 
caagtccata 
tgggtgtcca 
accaagtaac 
atgatgccat
gaatagcaca 
tacccccaag 
tgaagatgac 
atattttctc 
acataataca 
ttccactcaa 
tttcaactgt 
atacacaaaa 
gagtgggata 
gaggggcttg 
ttcctcacac 
tgtaatccca 
cagcctggcc 
ggtggcacat 
aggaggcgga
gtaagactct 
atccaggctg 
acactcagat 
gctgcagaca
accctcaaca
ctgcttctta
catgtcaccc 
gaattcccaa 
cacggaaaag
tggccatttg 
agctggaaga 
gctaattaaa 
gcacagaacc 
ttgcagataa 
tccctctgtg

ctccctcact gacaccctct caaggctgct gagcacgtgc catgctacgg ctttctccaa 41940

catcaggaaa tgttctccac tcagtttcac cttaatacaa atgtgttctc tcttcagaga 42000 

aggcaaaaaa attcatgacc atctgactgg gagaagtcat ttctaggtaa agtgtccacc 42060 

tttttctgag gaacacagga ggaaaatctt acagaaaaga gttaacacag caggcctaag 42120 

actgcttttt aaaataaata aataaataaa taaataaata aataaataaa taaataaata 42180 

aataaatgaa tgatagggtc ttctgtattg gccaggctag tctcaaattc ctggcttcaa 42240 

gagatcctcc caccttggtc tcccacagtg ttgggattat agacatgagc cattgtgctt 42300 

ggcccaagac tgttattctt aaaaagtctc ataaaaagca tggttaatcc ttggctggca 42360

cctgggaact tagatttcag aagggttccc accatccaac ctggaaagag ggactcactg 42420 

tgcctaaatt attgtgtggt ttatgctgaa ctcctgcttt tcttcaggta gcgtggaatg 42480 

tggcatgtgc tgggcaaagg gggcctgcat gaccagcccc caataaaaac cctgggtgtt 42540 

gggtctctag tgagtttccc tggtagacag catttcacat gcgttgtcac agctccttcc 42600 

tcggggagtt aagcacatac atcctgtgtg actgcactgg gagaggatgc ttggaagctt 42660 

gtgcctggct tcctttggac ttggccccat gcacctttcc ctttgctgat tgtgctttgt 42720 

atcctttcac tgtaataaat tacagccgtg agtacaccac atgctgagtc ttccaagtga 42780 

arcaccagat ctgagcatgg tcctgggggc ccccaacaca gaaataaatt ataaaagacc 42840 

a*qgactggg catggtggcc catgccggta atctcagcgc tttgggaggc cgaggcagga 42900 

gguccagtta agcccaaaag ttcaaagtta cagtgaccta tgactgcgcc aatgcactcc 42960 

aacctgggag acagagcaag accctgtccc caaaacaata aactaaacac atacttctgc 43020 

cttrcaagtg tcttaaaatt caatggaatg gtagaaacat ttttaaaaca ctaaatcaaa 43080 

rtnaaacctgg aaaacaagag tgccgatggc caactaaaat gtctaggaaa tttctgaaaa 43140

gtaaaaagta ctcagaacca gattacctga gcaaaccata gcccaataca agcttgggag 43200 

qaqqctgtta tgcagaagga aatggtaaca ggtttccagg aacagacttg taacagcaga 43260

tajiacagca gaggtagaac ctgacaaggt gattacctgg ggaactgcag tctgaatgac 43320 

cag.jactgtt ggacccttcc cctcacatgg aatacacacg ccactcagca gcacaccaca 43380 

rjccct:caac aatcacagga ggcacgctac gcctagtaag acaggaaaaa aggaattctc 43440 

aaaccccgaa gatgaacaca taaagaatca ccaagttttt attcagtatg atgaaacagg 43500 

gacactgaat caacagaaca caaacccaag caaagataat tactagagca catagaagaa 43560

attrtttagat attcttggga agacctaagg ggacattata aagagcaagc agttggtatg 43620 

tgacgatctt tgtgatatac caagaaataa aaacacagga tgaagaccag atagagaata 43680 

atgctactat ttgtgcaaaa aaggagaaat ggagaatctg attcatattt gcttgtattt 43740 

gcatgaagaa actttggaag gtacataagt aactaacaac aatggttacc tacttgtaag 43800 

gcgagagaag taagaggaca ggaatggtgg gaacaccttt tgtgtccgga attggtgggt 43860 

tcttggtctg acttggagaa tgaagccgtg gaccctcgcg gtgagcgtaa cagttcttaa 43920 

aggcggtgtg tctggagttt gttccttctg atgtttggat gtgttcggag tttcttcctt 43980

ctggtgggtt cgcagtctcg ctgactcagg agtgaagctg cagaccttcg cggcgagtgt 44040 

tacagctctt aagggggcgc atctagagtt gttcgttcct cctggtgagt tcgtggtctc 44100 

gctagcttca ggagtgaagc tgcagacctt cgaggtgtgt gttgcagctc atatagacag 44160 

tgcagaccca aagagtgagc agtaataaga acgcattcca aacatcaaaa ggacaaacct 44220 

tcagcagcgc ggaatgcgac cgcagcacgt taccactctt ggctcgggca gcctgctttt 44280 

attctcttat ctggccacac ccatatcctg ctgattggtc cattttacag agagccgact 44340 

gctccatttt acagagaacc gattggtcca tttttcagag agctgattgg tccattttga 44400 

cagagtgctg attggtgcgt ttacaatccc tgagctagac acagggtgct gactggtgta 44460 

tttacaatcc cttagctaga cataaaggtt ctcaagtccc caccagactc aggagcccag 44520 

ctggcttcac ccagtggatc cggcatcagt gccacaggtg gagctgcctg ccagtcccgc 44580 

gccctgcgcc cgcactcctc agccctctgg tggtcgatgg gactgggcgc cgtggagcag 44640 

ggggtggtgc tgtcagggag gctcgggccg cacaggagcc caggaggtgg gggtggctca 44700 

ggcatggcgg gccgcaggtc atgagcgctg ccccgcaggg aggcagctaa ggcccagcga 44760 

gaaatcgggc acagcagctg ctggcccagg tgctaagccc ctcactgcct ggggccgttg 44820 

gggccggctg gccggccgct cccagtgcgg ggcccgccaa gcccacgccc accgggaact 44880

cacgctggcc cgcaagcacc gcgtacagcc ccggttcccg cccgcgcctc tccctccaca 44940

cctccctgca aagctgaggg agctggctcc agccttggcc agcccagaaa ggggctccca 45000 

cagtgcagcg gtgggctgaa gggctcctca agcgcggcca gagtgggcac taaggctgag 45060 

gaggcaccga gagcgagcga ggactgccag cacgctgtca cctctcactt tcatttatgc 45120 

ctttttaata cagtctggtt ttgaacactg attatcttac ctattttttt tttttttttt 45180 

tgagatggag tcgctctctg tcgcccagac tggagtgcag tggtgccatc ctggctcact 45240

gcaagctccg cctcccgggt tcacaccatt ctcctgcctc aacctcctga gtagctggga 45300

ctacaggcaa tcgccaccac gcccagctaa ttttttattt tatttttttt ttagtagaag 45360 

cggagtttca ccatgttagc cagatggtct caatctcctg acctcgtgat ccatccgcct 45420

cggcctccca aagtgctggg attacagacg tgagccactg cgccctgcct atcttaccta 45480

tttcaaaagt taaactttaa gaagtagaaa cccgtggcca ggcgtggtgg ctcacgcctg 45540

taaccccagc actttgggag gccgaggcgg gcggatcacg aggtcaggag atcgagatca 45600

tcctggttaa cacagtgaaa ccccgtcgct actaaaaata caaaaaacta gccgggcgtg 45660

gtggtgggca ccggcagtcc tcgctactgg ggaggctgag gcaggagaat ggcgtgaacc 45720

tgggaggcag agcttgcagt gagccgagat agtgccattg ccttccagcc tgggcgacag 45780

agcgagactc cacctcaaaa aaaaaaaaaa aaaatagaga cccggaaagt taaaaatatg 45840

ataatcaata tttaaaaaca ctcaagagat gggctaaaga gttgacggaa caaatctaaa 45900

tattagattg gtgacctgca aaaccagccc aaggaacatc ccagaatgca gcccataaag 45960

ataaagagag catttccgct gggcacagtg gtatggcagg ggaattgcct gagtccaaga 46020

gttgcaggtc acattgaacc acaccattgc actccaggcc tgggcaacac agcaatactc 46080

tgtctcaaaa aaaaaaaaaa ttaaattaaa aaagacagaa tatttgagag aaaaaaatgc 46140

ttatttcaag aaacatgaaa gataaatcaa gatattctaa ttcccaagta agaataattc 46200

cagaagcaga aaatagaata gaggcaagga aacactcaaa acttctccag tgccatagaa 46260

atgtgtatta atctttagaa tgaaacggac taccaaatgc tgagcaggaa gaacaaaaga 46320

gatccactct taagccagtg tggtgcccaa gcgcagtggc tcatgcctgt aatcccagca 46380

ctttgggagg ccgaggcagg tggatcacct gaggtcagga gtttgagatc agtcaggcca 46440

acatggtgaa accctgtctg tactaaaaat acaaacatta gctgggtatg gtggtgcaca 46500

tctgtaatcc caactacttg ggaggctaag gcaggagaat cacttgaaac caggaggtgg 46560

aggttgtagt gagccgagat cacgccacac tcccagcctg ggtgacagag caagattcca 46620

tctcaaaaaa aaaatccact cctagacaaa taatagttaa attttagaac accaaggaga 46680

aagaaaaaaa attgtaaagc ttcagagaaa ataaacatta actacaaaga aacgagagtc 46740

agacgcgtgc acttcttcct agataccagc agataaagca atatctccaa aattcagaag 46800

gttttaacgt agaatcctat acccagtcaa gaatattcac atggaaaagt gaaataaaaa 46860

acattgttta aacatgcaag ggttcagaaa gtttaccatt cacagaatcc ctgaaaacaa 46920

aaccaaataa tcacttaagg actcattaag aaaacaaatg aaataaaagc accaatgatg 46980

agtaaataat cagaaaaatt tacagtttac ctaaataact gtttatgcat aatgtatgaa 47040

aacccaaaaa tttaatatgg gacagaatta aaatcatgat aagattcttt tttgctttac 47100

tcatggagag ttcacataaa cagattatct tttaatagca agagaaaaaa atgtttagat 47160

atgtgtgaaa aactaagggt accaaaacag tgcaaattca tttatcatca ggaaaatcca 47220

aattaaaacc acagtatcca ccagaataac taaaaggtaa aagacagaaa ttaccaagag 47280

ttggcaagaa tgtggagcaa ccacatatac ttctggggta aataagttgg tgcaaccggt 47340

actgaaaact gtttgctagt atctactaaa accgagcaca tgcacagact acaaccaagc 47400

agttccactc ccagatacac actcaacaga aatgcacaca ctcactcaac aaaagacgtg 47460

tactagagtg ttcatgtact tactattcat aatagtccaa aaatgcaaac aaccaactgc 47520

caatcaaagt caaatgtata tctatattag ggatatatac aatggcatat acacagcaat 47580

gagaatgaaa tgaaccagct cggcacagtg gttcatgcct gtaatctcag cactttgggc 47640

gggtaaggca ggcagatcac ttgaggtcag aaatttgaga ctagcctggc caacacggtt 47700

aaaacctgtc cccactaaaa acacaaaaat tagccgggca tagtggttgc aggcctgtaa 47760

ttccagctac tcgggaggct gggttgggag aatcgtttga acccgaaagc cggaggtcgc 47820

agtgagcgga gatcgtgcca ctgcactcca gcctggacga tagagcaaga ctccgtctca 47880

aaaaaggaaa tcaaaaatat aaaataagat gacaggaata atccgcaaaa gatcagtaat 47940

caaaataaat ataaatgggc taaagctacc tattaaaaga caaagatttc acacccataa 48000

ggatagctac tatcaaaaaa agagagagaa taacagatgt tagcaaggat gtatggaaac 48060

tgaaattctc acgcattgct ggtgagaata taaaatggtt cagcctctgc ggaaaacact 48120

atgctgggtc atcaaaaaat taaaaataga agtactactt gatccaacaa ttctacttct 48180

gggtatatac ccaaataact gaaagcaggg tcttgaagag atatttgtac acccatgatc 48240

atggcagcat tattcataat agctatgatg tggaaccaac ataaatatcc tttgataaat 48300

atatggataa gcaaaatgtg gtgtatacat tcaatggaat attaattagc aataaaaatg 48360

aagaaaattc tgacacatgc tacaacatgg atgaaccttg agggcattac attaaatgaa 48420

ataagccagt tataaaaaga caaatactat atgaggtact atattagata ctcatgcaag 48480

gtacctaaaa taggcaaatt catagagaca aaaagcagaa tggtggttgc caggggctgc 48540

ggtaatggat acagagcttc aattttgtaa gatgaaaaaa ttctggagat tggttgcata 48600

acaatgtgca cacacttaac actggggaac tgtaaactta aaagtagtaa atggtaaaaa 48660

taaaaataat aaataataaa ttttatgtta ttttaccaca atatttatta aaagacaaag 48720

attaactaat taaacaaaat ccagccataa gctaatggta agagtaacaa ttaaagaaga 48780

cacagaaaat tgaaaatcag tgactagaaa aagatattcc atataaatgc taacaaaaag 48840

caagtacagc aatataaaga gaatgaacaa aaaaaaaatt aaataagatg gctcgtttat 48900

tcccaaaagg tacaattcac caagaagata caagaattgt gaacctttaa gcacataaaa 48960

cagcttcaaa aatacaacat ttaaagaaaa atatatatta aacatagaaa tagtacaaaa 49020

acccctacaa gaatcataat gggagtcttc aatacaactc tccatatcaa caggtcaaac 49080

agagaaaaaa aataagttaa ggatgcagaa aacctgaatt accatcaata aacttgagat 49140

taacatagaa ctgtataccc aatatactaa gagttcaggg aacagtcgtg actgacagtg 49200 

gactgcaaat taatctgttc ttaatctttg tttttctttc agcactgtgg cagaatagag 49260 

atcctaaaaa ccttccagct acaaaacatc tttttaaaaa tataaaaaaa tacaaaaata 49320 

actctgaaat caatagaaga cacatggtga aaccaaaatt ctagaacaca gggagaanaa 49380

aggcattttc agatattaca aaaacagaaa attgatcatt gctgaagtaa tttctaaaga 49440 

atgtacctga gggagaagaa aaatgttcca aagaaaagta tctgtgatac aagaaggaat 49500

ggaaagtgaa gaaatggtaa acaggtagat aaagctaata aatgttgacc tagaaaataa 49560

caaaaacaat agcaataatg tctcgttgga agggttgaag taaaaataca attaaggcca 49620 

aatgtgaggt aagtggaatg aaagaattag aagtccttgc cttgttcaca ggactgatta 49680

aacaaatgag ccaggttttc cattcaaaca gttaaaactt gaacaaaata aactcaaatt 49740 

aagtagaaag ataaaaaaca gaaattaatg tcatagaaaa ataaaaaatc aatagaatta 49800

atcaataaat cctggttaat aaaagctggt tctttgaaag gattaataaa ataatcatta 49860

agcaagtctg accaaaaaaa aagagaaaag gtaccaaaaa aagtactgta tcagaaagag 49920

aacatacaga tacatacaga tatgtaagag tctgttttct tacaccagaa tactatatac 49980 

aacattatgc tagcatatat taaatttcaa taatgttaat gattttctag gaaaacagaa 50040 

aatattaaat ttactttgaa gaaacagaaa aactgagaaa aataaatgat catgaaaaaa 50100 

acgaaaaggt aattaaatac tgatattaac tgcctaaaca acaccagcag cagcccaggc 50160 

agtctgcagt caagttctgc caaacttgag ggaacagata attcttctat tccagagcat 50220 

agaaaatgat ggaaagtttc ccaatttaat cagagaggac agcctgatcc ttgttatgaa 50280 

cacagataaa aatggggtaa actatatgcc aaactcagat accaaaaccc taaataagat 50340 

gctagcttat tgatgtgaac aatccaaaag tgcattttaa attagcccag ggttttagag 50400 

aaagaaaatc tagcaatgtg accaccactt atgttaacaa ttttaagacg aaaatctaca 50460 

tgatcatatc aacgcatgct acacaaaagc atttgggcaa aaaacccaac acccaccctt 50520 

gacttctcaa actcttagta attaggcata aacagaaatg tacttaatgt gatagaatac 50580 

actcggtgaa gatacagagg gaatgctccc taaaaccaag cccaagacaa agattcctat 50640 

ttaacctcaa tagtcaacac tgcagcgaga gtaatctatg gaagacaagg aaaaaagtaa 50700 

aaacacgaga gacatctgtt gtttaacaga caataagatc acctacttgg aagaggcaaa 50760 

cgaatcaagc gaaaaactat taaaactgag acaggcttta gtatggaggc tcagcttcag 50820 

ctgtagtttg ggctaccaaa ttcaactcgc ttgcttggag agttaatcct gcaaagctaa 50880 

tttctgttga ggtattagga ttgacaagcc tgtgctcctc cctcctcccc catcttcaac 50940 

actgaaacaa cacggtgttt ggaactggat aacagaatct tccaaaaaca aaaattgtcc 51000 

tgaagggccg acttgtgccc ttactcaaaa aacactttat ctgctgcctg cagctcctac 51060 

agttgctggt ggataagcct gccaaccagc tcggcgtaat tcttcctgca gagggcaagg 51120 

aagagcactt tcacaggaaa atttttttcc gaactgtatg ccgcttatta cataaactta 51180 

cgtgctggca aacggagctc cagcaaaata agatatccag agtcaaactt ccttaggaaa 51240 

aaaaaaaaaa aaaagcaagc acataacact aatttccttg catgggcact ggggaaggag 51300 

gtcgttactt ccgcacgccc gcaggtccgc accaccggga aacccacggg caccgcgcgc 51360 

tgcccccggg ccttccaggt gcactgcgcc gcggcgcccc agctgacccg ggatgcgcag 51420 

ccctagccct tcccctgtca ccccggccag gaaggggcgg gagcgcggcg gacgccgagg 51480

gcgaagggct tctcggtcct ctgcaccacg cagcaccccc aaggcacaac agggagggtg 51540 

cgggaggctc ccgagaccca ggagccgggg ccgggcgtgc ccgcgcacct gtcccactgc 51600 

ggcgagggct ggggtcgcct ccagggccgc agctgtcggg agccacctgg ctctcagtcc 51660 

cgggtccctg cgacaaccct cgggcccgga ggggaggagg cggccacctg ccgctgccac 51720 

ctgcggcacc ggtcccaccg ctccgggccg ggcaggacag gccaggacgt ccctcctggg 51780 

ctggggacag gacacgcgac gaggggaccg gggcccccgc ggcgaagacg cagcacgcct 51840 

tcccagaaag gcagtcccgt gcccccacga cggactgccg gacccccgcg ctcgcccgcc 51900 

catcccttca gaccacgcgg ctgaggcgca aagagccggc cggcgggcgg gctggcggcg 51960 

cggctagtac tcaccggccc cgctggctca gcgccgccgc aacccccagc ggccacggct 52020 

ccgggcgctc actgatgctc aggagaggga cccgcgctcc gccggcgcct ccagccatcg 52080

ccgccagggg gcgagcgcga gccgcgcggg gctcgctggg agatgtagta cccggaccgc 52140 

cgcctgcgcc gtcctccttc agccggcggc cgggggcccc ctctctccca gctctcagtg 52200 

tctcatctcc ctatctgctc atcctctggt cgcacataat cgatgtttgg gcgtcccaag 52260 

ccagatgtgg accccatttc cgcactctac actggaggtt ttctaagggt ggtgcccgga 52320 

ccagcagctt cagcctcatc tgggaacttg agaaaatgca gattctccgt cccacccagc 52380 

ctattcggtt tttcctgcac taaaaccatg aaggtggggc ccagcagtcc acattctcgc 52440 

aagcccgtca agtgattctg aggcgccctc cagtttgaga gctatgctca cggcctcacc 52500 

tccgccccgc aaggagcccg gtcttgcctg tggcgctagc cgcacacgga cacctcatcc 52560 

tgcggggccc gcccccccgc tgcaccctca ccgcccaacg cctcctccgg gatgcagcgg 52620 

aggcgcctgg aagtcggcaa ggtcaacatc cccctcagca tcttccctac cctcacggct 52680

cctcctccag gggtgcctca tggccagggg ttagaaagag ccactgtgtt tcttgacatg 52740

gaagtggcct aagaccttaa tgaaaactgc aggagtggaa tgacagaacc tttggtcata 52800 

cttgagggcg tgaagctcaa atgaggagga aggaaaggat ccagggagaa taaccaaccc 52860 

tggcaagttg tggcgcccag gtagaggggc gagcctaggc tagcggttct cgaccagggc 52920 

cggtgttgcc cctcctcgcc gccccgcgta catttgggga ggtctggaga catttttggt 52980 

tgtcatgatg cgggagttgc tactgttgcc taagtgggta gacacgaggg tgctcctcaa 53040 

catcctacct gaaggacagg actgccccac aaggaagaat gatccggccc caaataagaa 53100 

accctgggct ggtcagcaac aacccctttg ttctgagaag agaggaggaa agaataaaag 53160 

aagtggggtg aagttttggt ttggtagagg aaacttgaag acattttcac tggaaaggaa 53220 

gagaggaaga ggagggagat gtctgtaagg acgagcaaac cgggtgacag ctgatttcct 53280

catattgaag taatgagtcc tagttataat aaattcctaa taaaaaccca gtttatccct 53340 

gcaataaact tgtctttttt ttttaaatat actgcttgat tctgtttgct aatattttat 53400 

ttacaggctt tgcattgata tgcaaaaatg agatgggcaa taattttctt tttgaatgtc 53460 

taatgttgtt tggtttcaga atcaatgtta tgctcacatc ataaaaaatt tggaaccgag 53520 

gcaggaggag tgcttgaggc cagaagttcg agaccagtct aggaaacaca gtgagacccc 53580 

cccatctcta caaaaaaaaa aaaagaaaaa aaaatgggca tgtttgcttt ttccttttac 53640 

cctgaacaat ttaaggagca ttaaaattat ctattctttg aggtttgatc atttcccagt 53700 

taaaaatgtt cctcccagcc tgatgctttc tttggggagg gtaaatcttt taaggctaga 53760 

aaagtttctt ctgtggcaat tttattattt acattttaaa aattattcta gagttaattt 53820 

tgutaaagca tgtatttctt aaaacaaatt atcctttttt tccagatgtt caagtgtatt 53880 

tgcataaagt tgaggaaagt agtcttttgt gaatctttta acttctccca aatatcttat 53940 

tttgtgcatt tttgcttctt tattttgtta acttttaaaa gtgtattttt ttttcaaaga 54000 

dtcagctctt aggtttatgt ttttggttat actggagctt ttttcttctt ctttttaaaa 54060 

tattttttct cctttatttt ttagacgtat tttgatctaa cgtaatcgga agaaggtaaa 54120 

ttagaatctt ttgttactat tgtgttttta tttctcctta tttctctgaa gtcctgcttt 54180

ataaatagta ccatgttatt tgtgcataaa tattcatttg tcttatattc ttgggaattt 54240 

tcccacttca tcataaaatg accttccttg tctcatttaa tgtgttcaaa ctttgccctg 54300 

aatttaactt tgtctgatat tttaccatcc tgctgaattt tgtttgttac cccaaacaac 54360 

ctttgctgtt ttcgtctttt ctgaaccctt tattttaggt aatcccttga attagagcac 54420 

taagttttgc tttgtgatta aatctgaaaa tctttatctt gccatagatg agttgagccc 54480 

tattcatgtg acagctatat tatgctgttt catagccctt ttggtccttt tttcactctt 54540 

gcattgcata ttttgtgttt attgtgtttt gtgtttcttc tgataatttg gaaggtttgt 54600 

atttttattc agggagttgc cttataacca tactccgcaa tacacatcgt cctcagtttc 54660 

ttcagactgt ctgttaactc cctattctga ataaaaatga cattgtaatt tccctctttt 54720 

ttctttaccc cttttcttct cctcacctaa tgtaaatgat tttatccttc tttagtattt 54780 

gcctttttaa ttaactacat ttataaatat ctttatcact tgatttttaa atcagctttg 54840 

aatgagatat ttggattcct agatataaaa gatgttaatt ataccatttc cacgttagta 54900 

ggtttataaa atcatacatt ctgctgtgta accataatcc cacgtttgtt ttagttccac 54960 

tcctacagtt aaaagattca gaagtattat taacagttat tttgccatag ttttttcccc 55020 

aacccatttt gtggtaagtt atgatcctgc tttagtttct taagaataat ttatagagca 55080 

gagtgtggtg gctcacgttt gtaatcccag cactttggga gacaagaggt agaaggatcg 55140 

cttgaagcca gcagttcaag accaccctga gcaacatagt gagaccttgt ctctacaaaa 55200 

aattttaaaa tttagccaga cgtagtggcg tgtgcctata gtcccagcta ctcaggaggc 55260 

tgaggcaaga ggattgctag agcccagaag tttgaggctg cagtgacctc tgattgtgcc 55320 

actgcacccc agtctgggca agaaagtgag aacctatctc tttaaaataa caataataac 55380 

ttatgaaaat tatattccct gagtttttca tgtttaaaaa tatttgttgc ctttatcctg 55440 

taaaagtttg agtataaatt cttgggttat actttattta ttgaagaatg tataagtatt 55500 

gtcttctaga attgagtgtt gctgtaatga aaccagaagt cagcctggtt tatttttcct 55560 

cagaaatgag gtaattgccg gccggacacc gtggctcatg cctgtaatcc caacactttg 55620 

ggaggccgag acaggtggat cacgaggtca ggagattgag accatcctgg ctaacatggt 55680 

gaaaccccgg ctctactaaa agtacaaaaa gttagctggg catggtggtg gacgcctgta 55740

atcccagcta cccgggaggc tgaggcagga gaatggcgtg aacctgggag gaggagcttg 55800 

cagagagctg agatcgcgcc actgcactcc agcctgggcg acagagtgag actccgtctc 55860 

aaaaaaacaa aaaaaaaaca aagaagtgaa gtaattgcca tgatgctcca agaattatct 55920 

ctttgtctat gaaatccaga aatctcactg ttatacattt tggaattatt attctgggcc 55980 

aatatttcct gggacacaat agattgactc tatagattta attttttttt tttttttgag 56040 

acagagtctc actgcaatct cagcttactg caacctctgc ctcacgggtt caagcaattc 56100 

tcctgcctca gcctcccaag tagctgggac tacaggcgcg tggcaccatg cctggctaat 56160 

ttttgtcttt ttagtagaga cagggtttca ccatgttggc caggctggtc ttgaacgcct 56220 

aacctcaagt gatccacctg cctcagcctc ccaaagtgct gggattacag gcgtgagcca 56280

ccatgcccag cctcaattcc tctttctatc tggtaatttt tctgaagttg aaaacatttg 56340

ttctaatacg ttatttcagt gttcttctaa gatgtgtaaa gcaccctatt cccaggtcag 56400

cccccatctt gctagtgagc tcggctggtt cttcacaaga gctctggttt tctcctgctt 56460

aatctcaagt acctctgtca gcctccacct ggtttatgat tcggagtttt ttggtttttg 56520

ttttttgttt ttgacagagt cttactctgt cacccaggct ggagagcagt ggcataatct 56580

cagctcactg caacctctgt ctcccaggtt tgagcgattc tcctgcctca gcctaccgag 56640

tagctgggat tacaggcgcg tgccaccaca cccggctaat ttttgtattt ttagtagaga 56700

tggggtttca ccatgttggc cagggtggtc ttgaactcct gacctcaggt aatccacctg 56760

cctcagcctc ccaaagtgct gagattacag gcgtgagcca ccgcgcctgg catggtttgg 56820

agttttaatc tgtagtttta ataaagatag tgcttatgtt tgtgtttctt atatttcttg 56880

gtactcttgg gtaatttgta agacccccat atctacacaa gaagtccatt ttcaattcct 56940

ttcttcagac tgtttatttt attttatttt attttatttt tatgtttgag atggagtctc 57000

gctgtgtcac ttctggaggc tggagtgcag tggcgcgatc tcaggtcact gcaacctccg 57060

tctcccgggt tcaagcaatt ctcctgcctc agcctcccga gtagctggga ttacaggcac 57120

ctgccacttt ttaatttttt tagagacaga gtctcgcttt gttgaccagg ctggagtgcg 57180

gtggtgcaat catggctgac tataacctcc aaatcctggg ctcaagtgat cctcctgcct 57240

cagcctcctg agtagctggg actacaggca catgccacca tgcccagtta attttaattt 57300

ttttgtagag acagggtctc catatgttgc ccaggctggc ctcctactcc tggcctcaag 57360

taatcctcct acctcagcct cccaaattac taggattata agcatgagcc accatgccca 57420

gccttgttct actactttaa tttcatatgt taggtgacca tgtaattgat catccaaacc 57480

aggatactgt aagaatgaaa gaggctgaca gtagtatgat gctgggacta gcattgtgca 57540

ctgagattat ttctgggaaa gcaggagata cggtcaccct acttatagtg tgcttgtctt 57600

tggattgttg aatttggagt ttctatttgc aggcttattt caactgggca gccttgatcc 57660

gccctgccca gcaatgctac cgttctctcc accgggtctc tgggacccct tcagtcacta 57720

tacttagctc agttccccac cctcccactc cctaaaagcg taaccaggaa tcctgcctca 57780

ggtctactgc cgtcttccgt gggctgtttc agttcctatt acccagagtc aaactcccag 57840

cattccctac ctgattccag acttggagtc cagagcttta acctcttcag gccaactccc 57900

cactttgcat ttctgtccct atatcttagt ccatggagat acatttcatg tctttgagtc 57960

tacttacaaa gtaaattttg ctgtttttta attttttttt tgagatggag tcttgccctg 58020

tcacccaggc tgtggtgcaa tgacgccatc tcggctcact gcaacctccg cctcctgggt 58080

tcaagcgatt catctgcctc agcctcccaa gtagctgtga ttacagacag gcaccaccac 58140

gcccagctaa ttttttttat cttttagtag agacagggtt tcaccatgtt ggccaggctg 58200

gtcttgaatt cctgacctcg tgatctgccc atctcggcct cccaaagtgc tgagattaca 58260

ggcgtgagcc actgtgccca gccaattttg ctttttttat atttcattgc tatatgttta 58320

gaggataagt ttacagtgct atatgcattc ccaaatatta gaccaaaaaa atctccaaaa 58380

aattagaaag aaaatccaaa aaatctcaaa aaataccaaa aagcaacaat ctcacagacc 58440

atactcactg acccccaata aaataaaatt agaaattaac cacaacttaa caaaataaag 58500

tactcaagtc agagaggaaa gaggaaataa acatcaaaat tacaaagtct aggcggtggc 58560

tcacgcctgt aatcccagca ctttgggagg ccaaggcggg cagatcacaa ggtcaggaat 58620

tcgagaccag cctggccaat atggtgaaac cccgtttcca ctaaaaatac aaaaattagc 58680

caggcatagt gatgtgtgcc tgtaatccag ccacttggga ggctgaggca ggagaatcac 58740

tgaacccagg gagacgaaga ttgcagtgag ccaaaatcgt gccactgcac ttcggcctgg 58800

gtgacaaagc gagactccat ctcaaaaaaa aaaaaattac aaactcttta gatagaaatt 58860

ttggtgtttt tttttgagac ggagtctcac tctgtcgcag aggctggagt gcagtgggac 58920

tatgtcagct caccgcaacc tccatctcct ggattcaagc aattctcctg tctcagcctc 58980

ccaagtagct aggattacag gcgcccacca ccagacccag ctagttttta tatttttagt 59040

agagatggtg tttcaccatg ttggccaggc tggtctcaaa ctcctgacct caagtgatcc 59100

acctgcttca gcctcccaaa gtgctcagat tacaggcgtg agccaccgca ccccacctag 59160

atagaaattt caacatgagg ccgggcacaa tggctcacgc ctgtaatctc agcacttcag 59220

gaggctgagg cgtgggagga tcacttgggc ccaggagttc aggaccagca tgggtgacag 59280

agacagaccc tgtctctatt tatttgaaaa aaaaaaaaaa aaagagagag agaaagaaat 59340

ttcaacatga aaagtatctc tcaaaccctt cgagatgttg gcaaaaagcg actcaaagga 59400

aaatgtatta ctgtgtgtga atttgcttga aaataagaaa gaggccgggt gtggtggcta 59460

acacctgtaa tcccaacact ctgggagtcc gaatcaagtg gatcatgagg tcaggagatc 59520

gagaccatcc tggctaacat ggtgaaaccc tgtctctact aaaaatacaa aaaattagct 59580

aggcgcggtg gctcatgcct gtaatcccag cactttggga ggctgaggca ggtggatcac 59640

ctgaggtcag gggtttgaga ccagcctggc ctacatggtg aaacctcgtc tcttctacaa 59700

atacaaaaat tagctgggcg tggtggtggg tgcctgtaat cccagctact cagaggctga 59760

ggcaggagaa tcgcttgaac ccgggaggcg gaggttgcgg tgagccgaga tcgcaccact 59820

acactccagc ctgggcaaca gcctgggtga cacagtgaga ctccatctca aaaaatacaa 59880

aaaatcagct
ggagaatgga 
tccagcctgg 
gaaccctgat 
taattaacaa
ttgagatgga
tgcaacccct 
actacaggtg 
caccatattt 
tctcaaagtg 
ctctgtgcaa 
agctcagtgt 
cctgcttcct 
tgctt tctct 
cctcctccat 
agcatt taag 
tgccccaatt 
gccaggcctc 
agctcccctg 
acatat ttag 
tactatccta 
gctgactcca 
gttactccat 
tttattcagt 
tttctcagcc 
ggaggatcat 
tctacaaaaa 
gaggctgagg 
gcaccactgc 
aacttttctc 
cctttgccgg 
tctttttttt 
aggtctcagc 
ccaagtagcc 
agagacaggg 
ccgcctcggc 
tacagtcttt 
atggctttaa 
cggagtctcg 
ctccacctgc 
ggcacccgcc 
atgttagcca 
agtgctggga 
cttgctgtgt 
tatctcaaat 
cctgcagtct 
attcttgact 
caaatccagt 
ctcacctgaa 
cttagcatag 
tccctgctct 
gagtctttcc 
tcactccact 
ctttagaacc 
ggctctctcc 
ggccggacat 
tcacctgagg 
acaaatacaa 
gctgaggcag 
cgccacaaca 



gggtgtggtg 
gcgaacctgg 
gcgacagagc 
aataaagaaa 
aggcagaagt 
gtcttgctct 
gcctcccggg 
cgcgccacct 
gttaggctgg 
ctgggattac 
aaggtcaata 
gtctggagaa 
aaaaatccta 
ccagaaaagc 
ccttagcctc 
agtgaacctc 
ctgcgtcctc 
ccctggagct 
ctcccttgta 
tgatgtttct 
tttgtttctt 
tttatcttct 
gaaatgacct 
ct t tcagcag 
aggcgtgatg 
gagagcccag 
ctaaaaagta 
cagtaggatg 
actccagcct 
agcatattcc 
ttcttcctca 
tttttttgag 
tcatgcaacc 
aggactacag 
ttttactata 
ctcccaaagt 
agacggcctc 
ataccatcgg 
ctcagtcccc 
caagttcaca 
accacgcctg 
ggatggtctc 
ttataggtgt 
gggagttctc 
gggcaatatg 
ccaccatctt 
cttctctatt 
tagctctcat 
tcactgcagc 
tctccacaga 
gctcaaaacc 
agtgacctac 
ccagctctgc 
tttgtatttg 
tgcacttcct 
ggtggctcac 
tcaggagttc 
atagtagcca 
gagaatcgct 
ccccagcctg 



gcctgcgcct 
gaggaggagc 
aagactcttg 
ccaaatgttc 
taaagggagg 
gtcacccagg 
ttcaagcaat 
ggcccagcta 
tctcaaactc 
aggcaggcgc 
aaaagagcaa 
aaaacaatct 
ctatgttgct 
tattcagaca 
agctgctgac 
cgcctccccg 
tcctctcacc 
ctggatccac 
ccatcaatcc 
cccatgtggt 
tccattctct 
cccgttctct 
ctgcactgcc 
catttgacct 
gctcacacct 
gagttcaaga 
gccagtgtga 
acttgagcct 
gagtgacagc 
tctgattctc 
tcctcctgat 
acgcagtctc 
cctgcctcct 
gcacatgcca 
ttggccacgc 
gctgagatta 
tctacctata 
tagactgatg 
caggctggag 
ccattctcct 
gctaattttt 
gatctcctga 
gagccaccgt 
ctcagaactc 
ctcaaaagtc 
aatgtccaat 
acacacccta 
catctcccct 
attctcctca 
gcagtcagag 
ctgtcgtgat 
atgatctgcc 
agctgtcctt 
ctgtcccctc 
tcctgaccac 
gcctgtaatc 
gagaccagcc 
ggtgtagtgg 
tgaacccaga 
ggtgacagag 



gtagtcccag 
ttgcagtgag 
tctcaaaaaa 
aactctcaaa 
atgataaagc 
ctggagtgca 
tctcctgcct 
atttttgtat 
ctgatctcag 
caccgcgcct 
acgtttacaa 
cgcttcagaa 
gttgaccatt 
ttctcctctt 
ctcacttcta 
cacgggcaaa 
atggatggac 
cacctgcagc 
ctcccctcac 
aaaatcactt 
gcaaaacttc 
gctgagtcct 
acatccaatg 
ggccgatcac 
gtaatcccaa 
tcagcctggg 
tggcatgcac 
gggaaatcaa 
gagaccctgt 
ctgctgcttc 
ctcttgacct 
gtctgtcacc 
gggttcaagc 
ccatgcccag 
tggtctcaaa 
caggcatgag 
cttgctcccc 
actcccatat 
tgcagtggcg 
acctcagcct 
ttgtattttt 
cctcgtgatc 
gcccagccga 
catactcata 
aattcctact 
ctaacattag 
tccaatcttt 
gttaccccct 
ctggtctctt 
ggatcctttt 
tcccgtttta 
tattatcacc 
tctgtttcct 
tgtctggaat 
catgtttaaa 
ccagcacttt 
tggccaacat 
cacacacctg 
aggcagagga 
caagacccca 



ctacccggga 
ccgagatccc 
aagaaaaaaa 
gctcggacac 
aatttttttt 
gtgatgcgat 
cagcctcctg 
aaaaaaatca cacaaacaca cttctcttca tattcctttt ccaagtttta tttttctcca 63540

gaatacttta cattgtttta atggaagttc tccgtttccc cccaactaga atggatactt 63600 

cctgcaggta ggcactctag tcctcccatc caagtactaa ccaggctcaa ccctgcttag 63660 

cctctgagag caggggagat caggcctgtt cagggtggta tggcccagga attttgattc 63720 

tgttttattc attgctgttc tgttgattct cttttgttcc tcctcccagt gctgagaaca 63780 

ctacttgtac ataataagca ttcaataaat atttgttgaa tgaatgactt gttgaatgaa 63840

ttaatctcag aaatgcagga ctggttctac attagaaaat ttttcaaggt cattctctgt 63900 

tgtcgtaaca cattaagaga ggaaaatttt gtactctaaa tcatttgata aaatacatac 63960 

cgatttctgt cttcaaaaac tcttagtggc tgggcgaggt ggctcacatc tataatccca 64020 

gcattttggg aggacgaggt gggcggatca cttgaggtca ggagtttgag accagcctgg 64080 

ccatcatggt gaaaccctat ctctactgaa aatagaaaaa ttagccgggt gtggtggcgc 64140

atgcctgtag tcccagctac ctgggaggct gaggcaggag aatggcttga acccgggagg 64200 

cggaggttgc agtgagccaa gatcatgcca ttgcactcca gcctgggtaa cagagtgaga 64260 

ctccatctca aaagaaaact cttagtgagt ttaggaatcc aaggaagacc ctcaaactaa 64320

acagataatc tagctaccag aagccttcag taaaccttaa cactccatgg tgaaacatta 64380 

gaaacattcc cactaaaaga caggctaaga atgcctgcaa tcttcacggc tagtccaaga 64440 

agtcaaaaag aagaaatgag cgctgattta aaaaaataaa caaacaaaaa actaccgatg 64500 

cagaggctgg cagcaaggac tgaaggactg tacagtactt gcctggagca ggcggatggc 64560

cacacccctg cgaagcctgc tcagctggct gggggacgct ccagtgtgtg agtggcagga 64620

tgcagggcac ttcctctgcc agggagttgc actggggaga tcctccccca ctcacacttt 64680 

ggcagctggg gctttggaat gtgacttagc ttctgtcaaa gggtcaatcc accctttgat 64740 

atatgatgca aaggcgaaca tatgatgcaa aggtgagaga acagcccaaa ttaggacttt 64800 

taccacagct gtggaggcgg acagcgacag tggtgggccc tggccagact tttcatgctc 64860 

aaaggtggtg gttgttcttc ctacttcttg tccctccagg gcttcctttg cctgtgtgct 64920 

gaacctgcct cttttaattt tttttaactt ttttaaattt ttaattgttt taattaaaac 64980 

aaattttgaa aactgtctga acctgctttt gaaccctgct atgatttgaa tgtttgtccc 65040 

ctgccaaact gattttgaaa cttaatctcc aaagtggcaa tattgagatg gggctttaag 65100 

cagtgactgg atcatgagag ctctgacctc atgagtggat taatggatta atgagttgtc 65160

atgggagtgg catcagtggc tttataagag gaagaattaa gacctgagct agcatggtcg 65220

ccccttcacc atttgatatc ttacactgcc taggggctct gcagagagtc cccaccaaca 65280 

agaaggctct caccagatac agctcctcaa ccttgtactt ctcagcctct gtaactgtaa 65340 

gaaataaacg ccttttcttt atgaattacc cagtttcaga tattctgtta taaacaatag 65400 

aaaacgaact aaggcaaact ctcatgattc tactgccatg ccattccaat aaactccctt 65460 

tatgcttaag agagccagag ttggccaggc gtggtgactc acgcctgtaa ttccagcact 65520 

ttgggaggcc gaggcaggtg gatcacaagg tcaggagatc gagaccatcc tggctaacac 65580

ggtgaaaccc cgcctctact aaaaatacaa aaaaattagc tgggcgtggt agtgggtgcc 65640 

tgtagtccca gccactcggg aggctgaagc aggaggagaa tggcgtggac ccaggaggcg 65700 

gagcttgcag tgagtcgaga tcgtgccact gcactccagc ctgggtgaca gaatgagact 65760 

ccgtctcaaa aaaaaagaga gccagagttt atttctgttg cttgcaacca agaaatctgg 65820 

ctggtgcact gaagtttcca taaataatag caatttaaag actctttcca agccaggcaa 65880 

tgcctagcct tgtgtagtcc ttgtggtaat acattcattc attcatttgt tcaaccaact 65940 

gtgctccaga gactaagaat acaaaaatgg gggccgggtg tggtggctca cacctataat 66000 

cctagcacct tgggaggccg aggcaggtag atcacctgag gtcaggagtt cgagaccaac 66060 

ctggccaaaa tggtgaaacc cctactctac taaaaataca aaaaattagc tgggggtggt 66120 

ggcggacacc tgtaatccca gctactcgtg agactgaggc aggagaatca cttgaacccg 66180

ggaggcagag gttgcagtga gccgagatcg caccactgca ctccagcctg ggcaacaaga 66240 

gcgaaactcc acctcgaaaa aaaaaaaaaa aaaaaaagag ggccggggct gggcgcagtg 66300

gctcacgcct gtaatcccag cactctggga ggccaaggca ggagaattac gaggtcagca 66360 

gatcgagacc agcctgacca acatggtgaa accccatctc tactaaaaat acaaaaatta 66420 

tccgggcgtg gtggcgcaca cctctagtcc cagctacttg ggaggctgag gcaggagaat 66480

cgcttgaacc cgggaggcag aggttgcagt gagccgaaat catgccactg cactccagcc 66540 

tgggtgacag agtgagactc cgtctcaaaa aaaaaataaa aaaaaaaaaa gaattcaaaa 66600

attgtagagt tatagtgtgc ttctagttta gttgagagga catctgtcct tcaaggaagg 66660 

ctagaatcta taccctgagt ccttactgaa atcaatccag cagtcaaaac atgggaccaa 66720

cgatcacagc agtaagatag gaagagcacc tttgtacatt tagctcatgt tgagataagc 66780 

cactgacaga gctgaaggaa gctcacagtt ctgggttcca tcctttggca tttaaaaaga 66840 

aaagtgctaa gaaaattcgg ttggtcacgg tggctcacgc ctgtaatccc aacactttga 66900 

gaggccaagg caggcagatc acgaggtcag gagttcgaaa ccagcctggc caacatggtg 66960

aaaccccgtc tctactaaaa acagaaaaat tagccgggca tggtggcgca tgcctataat 67020 

cccagctact caggaggctg aggcaggaga attgcttgaa cccgggaggg ggaggttgca 67080 
gcgagtgaga gcaggccact gcactccagc ctgggagaca gagcaagact ctgtctcaaa 67140

aaaaaaaaag aaaaaaagaa agaaaggaaa aaaagaaaga aaaaaaaaga aaaaagaaaa 67200 

ttcaggccag gccaggcctg gtggctcaca cctgtaatcc caacactttg ggaggctgaa 67260 

gcgagacggt gccttagccc aggagtttga gaccagcctg agcaacatag cgagaccctg 67320 

tctctataaa aaaaaatttt tttttggcca gacgcagtgg ctcacgcctg taatcccagc 67380 

actttgggag gccgaggcag gtggatcacg aggtcaggag atggagacca tcctggctaa 67440

cacggtgaaa ccccatctct actaaaaaat acaaaaaatt aaccgggcgt ggtggcgggc 67500

gcctgtagtc ccagctactc gggaggctga ggcaggagaa tggcgtgaac ccgggaggcg 67560 

gagcttgcag tgagccgaga ttgcgccact gcactccaga ctgggagaga gtgagactcc 67620

gtctcaaaaa aaaaaaaaaa aaaaaaaaat taattgtcag gtgtgctggc atgcagctgt 67680

agtcctagct actcgggagg ctgaggtaag aagatcgctt gagcccagga gttcaaggct 67740 

gcagtaatag tgcctctcac tctaccctgg gtgacaatga gaccctctct caaaaagaaa 67800 

gaaaaaaggg aaagaagaaa agaaagaaag aaagagaaga aaggaaggaa gaaagaaaga 67860

aaaagaaaag gaaggaagga agaagaaaaa aaaagaaaga aagaaaagag agagaagttc 67920 

aaagaccaaa gggtcaggat cccaaaatag tttttatgtt ttatttattt atttacttat 67980 

ttatttttga gacagtatgg ctctgtcgcc caggctggag tgcagtgatg cgattgcggc 68040 

tcactgcagc ctccaaactg ggctcaggtg gccctcccac ctcagcctcc cgagtagctg 68100 

ggaccacagg cgcgtgccac catgcccagc taatttttta attctttgta gagatgaggt 68160 

ctctatatgc tgcccaggct ggtctcgagc tcctgggctt aagccatcca cccgcctggg 68220 

cctcccaaag tgctgggatt acagaagtga gccaccgcgc ctaatcgggt ggtttgtttg 68280 

tttattgacg gggtctcgct gctgcccagg ctggagtgcc agtggctgtt cacaggtgca 68340

gtcctggagc attgcatcag ctcttgggct ctagcgatcc tccagagtag ctgcagctgg 68400

gattccaggc gcgccaccgc gcggggctca gaatgggttt ttatattgag ggttatgctg 68460 

ccacctagag gatatatgta gtaccgaact gtgtgcgcag ggaggctgag gttgcagtga 68520

gccaagatga tgccagggca ctccagcgtg ggtgacagag caagatttca tctcaaaaaa 68580 

aaaaaaaaaa aaaaaaaaaa aagaattgaa agtaaggtct tgaagagata tttgtgcctg 68640

tatggtcata gcagtattaa ctttgaccca ctagctaaaa cacaaaagca acatgtgtct 68700 

gtcagcaggt gaacggataa acaaaatgtg gtatatatgt acaattgaat attattcagc 68760 

ctttaaaaag gaataaaagg ctggatgcgg gggctcacgc ctgtaaccct aacactttgg 68820

gagactgagg tgggtggatc acccgaggtt aggagtttga gaacagcctg gccaacatgg 68880

tgaaacttca tctctactaa aaatactaaa attagccggg catggtggca cttgtctgta 68940 

atccaagcta ctggggaggc taaggcagga gaattgcttg aactcaggag ccggaggttg 69000 

cagtgagcta agatggcacc actgcactcc agcctgggca acagagtgag actccatctc 69060

aaaacaaaca aacaaaaaat tattatttcc aaagaaacaa gaccctgggt ccatttccca 69120 

gcccacacct gatgttgact cacaacacac agcctggttt gctatgagcc tgcttcattt 69180

aattgtcacc ttaacttcac atcaccctca agtcctggaa taactctttg ctgacctttg 69240 

tgtgctgagc catctccatg tcgctcaacg tgcagtccct ctcactgcac tgagtcaata 69300 

gccagacgtg gtctgactgc agggtcatcc ttggtggctt aggctgactc gggcatagca 69360 

gggtgctctg agacctcacc gcatataggc tttgccccca ataaactcta tataatattc 69420 

atattatgtg gtctgggtgt gtgtagcttt gcactgtctt ctcgtgacag tgccctcaac 69480 

ctctttccca ggatttcctc ctctacctcc tcaagtccca ctgctctgca aagaccaaaa 69540 

gctgcagagt cccagctccc tcctttacac cccacgacgc agcctcctct ctcagaaccc 69600

tttaaacaga gtcttttact gcagatccca agaacagcca cacccctctc tcccacccac 69660 

tccagacaca cccaggtaat tatagcaccc agggtaacta tgtagatgga gtccctggaa 69720

catgtggata gtgccccctg ggagtatgca aaagcaacat tgctggcacc tgcagagaac 69780

agggtgacat ccaggaatca gagcatgggc ctctgggagg tagggatgtg gccaggcagg 69840 

ctgccaaaaa ttggtagagc aaggccacag gatctttctg accttccttc caaacagagg 69900

ctcctgtact ggtgatccct gtgttgattg accactccct tcctgggggt cgtggtctct 69960 

gtcccagttg cccggacttc tgtgagtgtc ctactgaggt ccttttcatg agaagcatgc 70020

tgtccttcca cctgctggga gcaagagtga caacttcaat actataatag cagtggcata 70080 

cagagaagaa gaaagatgaa gtggcaagaa aaacaggctt ccaagcagga gtttttctat 70140 

aaaaacaaaa acgtttacaa gcaaactttt tataaagggc tagatagtaa atattttagg 70200 

ctttgagagc cacatagact tgtttgcagg gactcaatgt cgctattgta gtttgaaagc 70260 

agccatcagg gttatgtaaa tgagtgagtc tgattttgtt tcagcaaaat tttatttacc 70320 

aaaacagaca atgagtgggc tggatttggc ccatgatcct tagtttgcca actcctgctt 70380 

tgggctcacc cagatctgat tttgaattct ggctctgcta ctggttagct gcaggagctt 70440 

ggaaggctct ctgagcctgt ttcctcatct gtaaaattaa agcaataatt tctaacactc 70500 

aagagtgtta cctcacgcct gtaatcccag cactttggag gctgaggcag gcggatcacc 70560 

tgaggtcaga agttcaagac cagcgtggcc aacgtggcaa aaccctgtct ctactaaaaa 70620 

atacaaaaag tagccgggca tggtggcgcg catctgtaat cccagctact tgggaggctg 70680 
aggcagggat actgctagaa cctgggaggt ggagcgtgca gtgagtggag atcacacccc 70740

cacactccag cctggccgac agagcgagac tccatctcaa aaaaaaaaaa aaaaagagtg 70800 

ttagaaggtt ttgagataat gaataaaaga tgccttgtgt atactaagta ttcaacaact 70860 

gatagccgca ttggtctaat tataacagtt tagaagcgat tgagtcaaca aatgctggat 70920 

ttgtcaggga ggacttccta tcaggaggta gatcttgggc tgagtcctga agcaaagata 70980 

ggcattggat agaggagttg agagaacacc ctaggactgt tattattatt attcgacacg 71040 

gagtctcttg ctctgtcacc caggctggag tgcagtggcg cgatctcggc tcactgcaac 71100 

ctctgcctcc caggttcaag cgattctcct gcctcctaag tagctgagac tacaggtgtg 71160 

tgccaccaca cccggctaat ttttatattt ttagtagaga cagagtctca ccatgttggc 71220 

catgctggtc tcgaactcct gacttcaggt gatccacccg cctcagcctc ccaaagtgct 71280 

ggaataacag atgtgagcca ccgcacccag cccagaacca tttttcaatc cttggctctg 71340 

ccttttatta gctgcaagat ctcaggcaat ttatttaacc tctccaaaga ctcattttct 71400 

cattcacaaa atgaggcaaa taataatatc tactatccca ggttgtcatg agaattaaat 71460 

gcaacatgac atttaatgaa atgagaagtc ccttggacat taactggcta aagtatgtgc 71520 

tcgacaagga tatcatttta ggtggatact tagcatctca gaactgatgc tcacaatgga 71580 

atatcattga aacgcattaa aattcatttt aaatgattgt aggtagtgag gcaattgaaa 71640 

gaagaagaca agaggactga ttataatgct tcaggctcac tagtctcctt ttaggaggga 71700 

aaaacaattt caagttaaat tttaggctct agatttttac ccctgctgct cattagaatc 71760 

acccagattg atgaaatcag agcccatctg aggctgtgtt tttcatctcc agaatgagag 71820 

ctgttgtggg gattaagttt ttgaaaaagt acatctaaca ggtgatcgaa aatgatagtg 71880 

atattattgc agtgatggtc attattgttg ttattattat actgaaagag gcttcagttt 71940 

tctgatccat aaagtgaggg aattgcatga gaccattgct aagattcctt ctagctctgt 72000 

ttttttgttt ttgtttttta gacagagtct ctgtcgccca ggctggagtg caatggcatg 72060 

atcttggctc actgcaacct ccgcctcccg ggttcaaatg atcctcctgt cccagcctcc 72120 

gaagtagctg ggactacagg cacacaccac catgcccagc taacttttat atttttaata 72180 

gaggtggggt ttcaccatat tggtcaggct ggtctcaaac tcctgacctc aggtgatcca 72240 

cccgcctcgg cctcccaaca tgctgggatt acaggcatga gccactgtgc ccaacccctt 72300 

ctagctttct tgatcactga ttctagggtt ctctgctgaa atatatttga gacatcctgg 72360 

ataaaagatc atgcaagagc tcccaatatg gtattaataa ttgattctgg aggcttagct 72420 

actcctgatg gattagacat gactcaactg cctctcttat gtgtacaaca caacaacaca 72480 

accaagaaag gttattctgg cattccattt attcagttta tttacagccc ttacttccag 72540 

cagcacgtta aagatatggc cagggccggg tgcagtggct caagtctgta atcccaggac 72600 

tttgggaggc caaggtgggc ggatcacaag gtcaggagtt tgagaatctg gcaattcttc 72660 

agacttagaa gcaaccagct cgataacaca gtcttgtgtg ggctctccct ctgtccctcc 72720 

ctcgcttccc tcatttctca tccctgcccc tgagactgtg caccttcaca tagccctgcc 72780 

atgagacctt catctcaggc tttgctttct ggggtaactg aggctaaaca ctgagtggcc 72840 

ctaaaagagg attgggattc ggaagttaga ttattcacca gagaacagac tttgctgatg 72900

atcaggccca ggttgtaatt gttgaaaaaa agagaggatg catagtctta tctcatctcc 72960 

tagtcaaagt caacaccatg ataaataaga gtcaaatcct gagatgtgaa ttggggacat 73020

ttgagtggtt aaccctgaga agcttgcacc ttcagacccc tcaatacccc tgctccccag 73080 

agaaggctgg acattgacct cagcacaggc aggagccctg caagatgcca tttgtcctac 73140 

taaagatgga cccctccact ctgtttctag gtaaataacc aaagtcaagt ctccacacag 73200 

cctgagcaag aaagtcagag cctgctacag gagaaaatac cacactggcc aaaggattca 73260 

ctagccctgg ccactgtgtg tgggaggaac cagggaatca tgtgtgggag tcaatgttga 73320 

agctgttgga ctgggggtgg ggtggaatat aagcctggcc ctggggagct tttcccgttt 73380 

gagggccttt acccacaact caagatccag tgctatagca ggagatccca gagctagtcc 73440 

taacagatgg tcaggattga acttggccta gagtaaaatg aggaggatag tgccagaact 73500 

ttctcaacat actattgagg aagaggtcag aaggcttaag gaggtagtgt aactggaaag 73560 

gggtcctgat ccagacccca ggagagggtt cttggacctt gcataagaaa gagttcgaga 73620 

cgagtccacc cagtaaagtg aaagcaattt tattaaagaa gaaacagaaa aatggctact 73680 

ccatagagca gcgacatggg ctgcttaact gagtgttctt atgattattt cttgattcta 73740

tgctaaacaa agggtggatt atttgtgagg tttccaggaa aggggcaggg atttcccaga 73800 

actgatggat ccccccactt ttagaccata tagagtaact tcctgacgtt gccatggcgt 73860 

ttgtaaactg tcatggccct ggagggaatg tcttttagca tgttaatgta ttataatgtg 73920 

tataatgagc agtgaggacg gccagaggtc gctttcatca ccatcttggt tttggtgggt 73980 

tttggccggc ttctttatca catcctgttt tatgagcagg gtctttatga cctataactt 74040 

ctcctgccga cctcctatct cctcctgtga ctaagaatgc agcctagcag gtctcagcct 74100 

cattttacca tggagtcgct ctgattccaa tgcctctgac agcaggaatg ttggaattga 74160 

attactatgc aagacctgag aagccattgg aggacacagc cttcattagg acactggcat 74220 

ctgtgacagg ctgggtggtg gtaattgtct gttggccagt gtggactgtg ggagatgcta 74280 
ctactgtaag atatgacaag gtttctcttc aaacaggctg atccgcttct tattctctaa 74340

ttccaagtac caccccccgc ctttcttctc cttttccttc tttctgattt tactacatgc 74400 

ccaggcatgc tacggcccca gctcacattc ctttccttat ttaaaaatgg accggggctg 74460 

ggcgcggtgg ctcatgcctg taatcccagc actttgggag gccgaggcgg gcggatcatg 74520

aggtcaggag accgagacca tcctggctaa cacggtgaaa ccccgtctct actaaaaatg 74580 

caaaaacatt agccaggcgt ggttgcaggt gcctgcagtc ccagcggctc aggaggctga 74640

ggcaggagaa tggcgtgaac ctgggaggtg gaggttgcaa tgagccgaga ttgtgccact 74700

gcactccagc ctgggtgaca gagcgagact ccgtctcaaa aaaaaaaaaa aaaaaaaaaa 74760 

tagctgggca tggtggcgcg tgcctgtaat accagctact ctggaggctg aggcaagaga 74820

atcgcttgaa cccagtaggc ggaagttgca gtgagccgag accttgacac tgcactccag 74880

cctggtgaca gagtgagact ctgtctcaaa aaaaaaaaaa agaaaaaaaa agacagaaag 74940

aaagagcaca gacagagtca caggtatttg cagtaggaag ctgtcaggtt agagtgcacg 75000 

gaaatagaaa gtatatttta cacttacagc acatcttcgt ttgattagcc acatttaaaa 75060 

tactgaatag caacgtgtgg ctatttagta ttcactaaaa tcttggacag tgcaagtcta 75120 

aagaatcctt gatccgtccg gcatggtggc tcacgccttt aatcccagca ctttgggagg 75180 

ccaaggtgga aggatcactt aaggtcagga gttcgagacc agcctggcca acatggtgaa 75240

acctcgtctc tactaataat acaaaaaaaa ttagccgggc atggtggtgc atgcctgtaa 75300 

tcccaggtac ttgggaggct gaggcaggag aatagcttga atccaggagg cgctgcagtg 75360

agccgagatc atgccatgcc actactgcac tccagcctgg gcaacagagt gagactgtct 75420 

caaaaaaaaa aaaaaaattg ttgggcgtgg tggctcacgc ctgtaatccc agcactttgg 75480

gaggctgagg ggggtggatc acctgggttc tggagttcga gaccagcctg gccaacatgg 75540

tgaaacccca tctctactaa aaatacaaaa attagctggg cgtggtggtg ggcacctgaa 75600

atctcagcta ctcaggaggc tgaggcagga gaatttcttg aacccaggag gcagaggttg 75660 

cagtgagcca agatcgcgcc tctgcactcc atcctgggtg gcagagcaag actatgtctc 75720 

aaaaaaaaaa aaaaaaatac ttgattgtct ggacattctg cagaacatca tatggagaca 75780 

ctatgttgac gacatcatgc tgattgtaag caagaaatgg caagtgttcc agaaacacag 75840

tcaagacaca tacatgccag aaggtgagat ataaactcta ctaagattca gtggcctgcc 75900 

acactggtga catctttaaa cctgctagat gtttgtgtag aaaaggattt aaccttgccc 75960 

aaagaggggt ctggcctttg tccccagcta ctggacataa tctctttaaa ctcttgaaat 76020 

atcattcctg atagaagtat ttttgttttg actaggggcc ttgggccagc cagatagcaa 76080 

caatgtgacc tgggttgggg gctttggatc aggtggcatc agtgtgacct cctgagtggc 76140

tagagactag aatcaaccac atgggcagac aacccagctt acatgatgga attccaataa 76200 

agactttgga cacaagggct tgggtaagct ttcctggttg gcaatgctct atactgggaa 76260 

acccattctg actccatagg gagaggacaa ctggatattc tcatttggta cctccctggg 76320 

ctttgcccca tgcatttttc ccttgtctga ttattattat tattatgaga tggaatctcg 76380 

ctctgtcacc caggctggag tgcagtggaa tgatctcaac tcactgcaac ctctgcctcc 76440 

ccggttcaag cgattttcct gtctcggcct cccgagtagc tgggactaca gatgcatacc 76500 

accacacccg gctaattttt ttgtattttt agtagagacg gggtttcacg ttagccagga 76560 

tggtctcgat ctcctgacct catgttccgc ctgcctcggc ctctcaaagt gctaggaata 76620 

catgtgtgag ccaccgcgcc cagccccctt ggctgattat taaagtgtat ccttgagctg 76680 

tagtaaatta taaccgtgaa tataacagct tttagtgagt tttgtgagca cttctagcaa 76740 

attatcaaac ctaaggatag ccttggggac ccctgaactt gcagttggtg tcagaaataa 76800 

gggtgctcat gtgtgtacca tgccctctaa ttttgtagtt aattaacttt cacaacttta 76860 

ttattaccgc ttacactcaa tgtttattca catttatcca cataccactt attctagtgc 76920 

cttgcatcaa agactctcta tctcatgtac tttattctgc ttgaagtaaa tcctttagga 76980

tattcttttt tttttttaaa ctttgcacat acatactttt attttttatt tatttttaat 77040 

tttgttattt tcgtgggtac gtagtagata tatgtattta tggagtacat gagatgtttt 77100 

gatacaggca tgcaatgtga aataagcaca tcatggagaa tggggtatcc atcctctcaa 77160 

gcaatttatc cttcaagtta caaacaatcc aattacactc tttaagttat tttaaaatgt 77220 

acatttaatt ttgtattgac tagagtcact ctgttgtgct atcaaatata attttttttt 77280 

tttttgagac agagtctcac tcagtggccc agactgaaag tgcagtggca caagctcggc 77340

tcacttcaat ctctgcctcc ctggttcaag cgaatctcct gcctcagcct cccacatagc 77400 

tgggattaca ggcacacacc accatgccca gctaattttt atattttttt agtagagacg 77460 

ggttttcgcc atgttggcca ggctggtctt gaactcctgg cctcaaatga tctgaccacc 77520 

tcagcctccc aaagtgctag gattacaggc atgagccacc acacctggcc aaaatagaat 77580 

attctttagt gaggtctgct ggtgacaatt tttttctttt ttttgagact gagtctcgct 77640 

gttgtcagct tgggctggag tgcaatagca cgatctcagc tcactgcaac ctccacctcc 77700 

cggattccag caattctcct gcctcagcct cccaagtagc tgagagatta caggcaccca 77760 

ccaccacacg cggctaattt ttgtattttt agtagaaatg ggggttcacc gtgttggcca 77820 

ggctggtctc gaactcctga cctcaggtga tccacccacc ttggcctccc aaagtgctgg 77880 
tagttttcca gataccagga ctgactccaa ggactattac tcatctggag ggcttagcac 81540

agtaccgtcg catagtaaat ttccatgtca gttttggtta cctttcatgc acttgcaaac 81600 

atgccatgct ctgaaacgaa ataggcacat cttttttttt ttttttttta aggagcctcc 81660 

ctctcgccca ggctggagtg cagtggcgcg atcttggctc actgcaacct ccacctcccg 81720 

tgttcgagat tctcctgcct cagcctcctg attagctggg actacaggca tgccacgacg 81780 

cccagttaat ttttgtattt ttagtagaga cggggtttcg ccatcttggc caggctggtc 81840 

taactcctga cctcaggtga tctgactgcc tcagcctctc aaagtgttgg gattacaggc 81900 

ataagccact gcatctggcc agaaatgaaa taagtaaatc ttttaacctg ctctaacaat 81960 

atagtgaaaa gaccatatta ttattagagc aggttaaggg atttgcctat ttcgggttct 82020 

agttatagtc ttaaacttgg acattcttgt agaaagtaaa aagtttcctc ttcaaagttc 82080 

cccttcttgt taaagaatac atcataagtg ttagaagtaa tagtttattt taaagactaa 82140 

ctttcttcaa gcctccttgc tttgtgctaa taactctttg ttaagcccta tcctatgtaa 82200 

ctgttggaca tgctcacagg cacgttccag ttcacagcct atgccccttc cttatttgga 82260 

aatgttattg cttccttaaa cctttcggta agcaacttcc tctccttctt cgttcttcct 82320 

tgcacttacc tatttagaaa gttttaggct attagcaaat cggctatcag tttaagagtg 82380

tgaggtcccg ctccagccaa tggatgcagg acatagcagt gaggacgacc caaatgcgta 82440 

agggataaat atgtttgctt ttcctttgtt caggtgtgct ctcgacatcg ttccacctgc 82500 

gattgagcac cctttctgca gaaagtaaag attgccttgc tggagatctt ttgtctccgt 82560 

gctgactttt cttcgtggca ccgattatct atttctaaca attttggtat ttctaacatt 82620 

ctgaacaatc ttgggctagt tgtctcttct gggcctgttt ccccatccgt cacatgataa 82680 

acttcattgg tttaaaaacc ccagcgaaca tttattgagt tactattacc ttcctgccct 82740 

ccccaacccc aaccccaggg agcagttaca acctcagccg ctgagcgcac tcgccgggtg 82800

ttaagaagca ccaaagacag ggaggcttga ttgattttgc tttgggagta gagggtcaga 82860 

agattcacag gaaaatggca tttgagcaag gatgattcac tggagctagc ttttaaatac 82920 

tggcgaggct tttatgttgc agtcccttac aaagttgagc attcgcaggg actgcactcc 82980 

gaaataagcc cgcttcccct tttcattcgc taatgatcca gggagctgct ggttccgcat 83040 

gcggcaggtt gtgccttttc ctaatcaggg ttctgcatcg cctcgaaccc gcaggccgtg 83100 

gcgggttctc ctgaggaagc agggactggg gtgcagggtg aagctgctcg tgccggccag 83160 

cgcctgtgag caaaactcaa acggaggagc aggaggggtc gagctggagc gtggcagggt 83220 

tgaccctgcc ttttagaagg gcacaatttg aagggtaccc aggggccgga agccggggac 83280 

ctaaggcccg ccccgttcca gctgctggga gggctcccgc cccagggagt tagttttgca 83340 

gagactgggt ctgcagcgct ccaccggggg ccggcgacag acgccacaaa acagctgcag 83400

gaacggtggc tcgctccagg cacccagggc ccgggaaaga ggcgcgggta gcacgcgcgg 83460 

gtcacgtggg cgatgcgggc gtgcgcccct gcacccgcgg gagggggatg gggaaaaggg 83520 

gcggggccgg cgcttgacct cccgtgaagc ctagcgcggg gaaggaccgg aactccgggc 83580

gggcggcttg ttgataatat ggcggctgga gctgcctggg catcccgagg aggcggtggg 83640 

gcccactccc ggaagaaggg tcccttttcg cgctagtgca gcggcccctc tggacccgga 83700 

agtccgggcc ggttgctgaa tgaggggagc cgggccctcc ccgcgccagt ccccccgcac 83760 

cctccgtccc gacccgggcc ccgccatgtc cttcttccgg cggaaaggta gctgaggggg 83820

cgccggcggg gagtcaggcc gggcctcagg ggcggcggtg gggcaggtgg gcctgcgagg 83880 

gctttcccca aggcggcagc aaggccttca gcgagcctcg acctcggcgc agatgccccc 83940 

tgagtgcctt gctctgctcc gggactcttc tgggagggag aaggtggcct tcttgcgcga 84000

ggtcagagga gtattgtcgc gctggttcag aagcgattgc taaagcccat agaagttcct 84060 

gcctgtttgg ttaagaacag ttcttaggtg ggggttagtt tttttgtgtt tctttgagga 84120 

ccgtggatca agatcaagga aatctcttta gaaccttatt atggaagtct gaagtttcca 84180 

aatgttgagg gttttatgtc taaaagcaac acgtgaaaaa attgttttct tcacccagtg 84240 

ctgtcttcca atttcctctt tggggggagg ggtagttact gctgttacta aaataaaatt 84300 

acttattgct aaagttcccc aacaggaaga ccactacttt tgatgacttt ggcaagtttg 84360 

ctaactactg gaaccctaac ttacaaacga actacttaca tttttgattt ccagttgtat 84420 

tacctgccca atgtttacgt agaaacagct taattttgat tctgggtaac gttgttgcac 84480 

ttcattaaaa atacatatcc gaagtgagca agtatgggtc tgtggacagc agtgattttt 84540

cctgtcaatt cctgttgctt cagataaaat gtaccagaca gaggccgggc gcggtggctc 84600

acgcctgtaa tcccagcact ttgggaggct tggcgggtgg atcacctgag atcgggagtt 84660 

caagaccagc ctgaccaaca tggagaaacc ccgtgtctac taaaaataca aaattagcca 84720 

gggtggtggc gcatgcctgt aatgccagct acttgggagg ctgaagcagg agaatcgctt 84780 

gaacctggga ggcggaggtt gcggtgagcc gagatagcac cactgcactc cagcctgggc 84840

aaaaagagcg aaactccgtc tcaaaaaaaa agtaccagac agaaatgggt tttgttttct 84900

ttttttgttt tgagacggag tttcgctctt gttgcccagg ctcgagtgca atggcgcgat 84960 

ctcagtctcg gctcactgca acctctgtct cccaggttta atcgattctc ctgcctcagc 85020

ctcccaagta gctgggatta cccatgcccc accatgcccg gctaattttt gtatttttag 85080 
tagaaacggg gcttcaccat gttaggctgg tcttgaaccc ctgacctcaa gtgggcctcc 85140

cacctcggcc tcccaaagtg ccaggattac aggcatgagc caccgcggcc agccagaaat 85200

gggttttgga aaaagcacta aacaaaatcg aacttggttt catatgacag ctctgctgct 85260 

aactgtaaca ggggcagacc agctaaccta cttttctgtc ttctgtcagc tgagaattag 85320 

atgattccca aaggcccatt gaactctgaa tgactttaaa tacttcttct taagtgggta 85380 

cacggttttg gtaactgatg ccaggtgatg aatgcatgaa agtgcttaat gaatgaaacc 85440 

ggtaaaatag taggaggaag ctttattggt aaggcagggg tatacctaat agctctctaa 85500 

tttattggta ttgaagtggt taacttttgt ttttttaagg ggggaaaaca ttctaagaat 85560 

aatgaggcaa actgcatatt gcacaagaga ctgttgtctc tattcaacaa ataccttttg 85620 

agtgtccaga gtctgccagg tgctgtgcca ggccctcacg atcgagtagt gaaccagaga 85680

atgtccctgc acccatggag cttattgtct actggggtag acagataata aataagcaaa 85740 

caaatcttct ctcttctccc tttcgctcca tgtaagtgtg tgtgtatagg tgtatactta 85800 

caagttgagt aaagtgttat gaaagattaa gaggagaaat gcattttggt tagatgttag 85860 

aggactcagc aggtgacctt gaaacttaga gctgaaggat cagtaggagg taactagaga 85920 

ggccagggaa tcgcatgttc aaaggccagg aggcaagaaa gagcatggtg cccttcaaga 85980

gaggaaagaa ggctactgtg actggagcat agatgtaggc aagtgttggg tgattgagag 86040

ctctacgggc catggttagg ttttattcct aatgccgaga tgccaaacat ggtggttcat 86100 

atctgtaatc ccagtatttt aggaggccga ggcaggaata tagcttgaac ccaggagttc 86160 

aagaccagcc tgagcaacat gagacctgta caaaacattt aaaaaattgc tgggtatgat 86220 

ggtgcacacc tgtggtccca gctactcagg aggctgaggc agaaggatca cttgagccta 86280

ggaggtggag gctacaatga gccatatttg agtcactaca ctccagcctg gatgacaaag 86340

tgagaccatg tgtcaaacaa aatacagaaa gaatattaat ttaaaatttt gaaagaggag 86400 

tgatctgaac ttatatctta aaaagatcat tctagggcat ggtggctcat gcctgtaatc 86460 

aagggctttg ggaggctgag acaggaggat cacctgaggc cagttcgaga tcaacctgta 86520

cagcatagag agactccatc tctacaaaaa gaaaaaataa atagctgggt gttgtgagtt 86580 

attcaggagg ctgaagcaga aagatcactt gagcccagga gtttgaggct gcagtaagct 86640

atgatcccac cactgcaaca cagtgagatc ttgtctcaaa aaaaaaaaaa aatcattcta 86700 

ggtgcttctt ggaggctgga tgtggtaaga gtagaagctg gagatggtcc tgttagggat 86760 

tcgattcaga ctttaaatac catcaatgca ttgagtccca aatttacatc actacgttgg 86820 

atccttgccc ctgaatccag actggtatat ccaactttag gttcagtttg tatctctacc 86880 

tgaccaatat agaggtgtcc agtctttcgg cttccctagg ccacattgga agaagaattg 86940 

tcttgagcca cacatagagt acactaacgc taacaatagc agatgagcta aaaaaaaatc 87000 

gcaaaactta taatgtttta agaaagttta cgaatttgtg ttgggcacat tcagagccat 87060 

cctgggccgc gggatggaca agcttaatcc agtagatacc ttcaacttac aatatctaaa 87120 

attttatgcc agatttagtc attttaaacc tgctcatcag tttttctcaa gaagtagtat 87180 

tttggctttt tttcttttct tttttttgag atggagtctc gctcttatcg ttcaagctgg 87240 

agtgcagtgg cggatcttgg ctcactgcaa cctccgcctc ctgggttcaa gtgattctcc 87300 

tgcctcagcc tcgcaagtag ctggaattac aggcatgcgc caccatgacc agctaatttt 87360 

tggagacagg gtttcaccat gttggtcagg ctggttttgt actcctgacc tcaggtgatc 87420

tgcctgcctc ggcctcccaa aggctgggat tacaggcatg agccaccgct cccggctgca 87480

tttttggatt tttagttgct cagcccaaaa ctttagtaca tctttgaacc tcttctttcc 87540 

tcctactcta tatctgatcc atcagcaaat ctgttaggtc tacctcacac atatcgaaat 87600 

cctaccacgt ctcaccatct gtgacaatta acaccctggt ctaggcagtc atctctgtta 87660 

agattgagtg gttaaggatg tcctctaagg agatgacatt caaatcttag cttaaatgtc 87720 

aagagggagc tggttttata aagattgagg aggcagcatt attttgccat aggcttccat 87780 

ttggtttcca ttccattctt gatacttatg gtatatattc aaaacaaatg cacagaaaca 87840 

gacccaggta tattgggaat ttcggatata gagttcctag ttgggaaaag atagactgat 87900 

ctgtaaatga tgctagttat ccatcatctg gcaaaaaata atttcctgcc tcctctcata 87960 

tatctcagat caacagactt tttctgttaa gggccaaatc ataaatattt taggctttcc 88020 

agaccatatg gtttctgtca cactctcctt tatccttgaa gccatagaca atatgtaaac 88080 

aaatgggcat ggctgtgcta cgataaaact ttacttacaa aaactggtag tgggccagtt 88140 

taggcatggc cagcactttg ggaggctaag gcagatggat cacttggggt caggagtttg 88200 

agaccagcct ggccaacatg gtgaaaccct gtctctacta aaaatacaaa aaatagctgg 88260 

gcatggtggt gggtgtctat aattccagct actctggagg ctaagacaca agaatcactt 88320 

gaacccagga ggcagaggtt gcagtgagct gagatagcac cactgcactc cagccagggt 88380 

gacggagtct taaagcaaaa caaaacaaaa ggtagtgggt tgtatttggc ccatgggctg 88440

tagtttgcca atccctgatg cagaaacaaa ttccaggtaa ataagagcct ggaatgttaa 88500 

aaaaacaaaa cttgaagtca tgtagaagaa caggtagggg gaacaatcct gatctcagga 88560 

taggaaggga tattgcttaa aataagacac aggaaaatat aatccatgtt gtgtaaattt 88620 

gactacgtta aaacttaaaa ctttcgccaa gcgcggtggc tcacgcctgt aataccagta 88680 







ctttgggagg ccgaggtgag cagatcacca ggtcaggaga ttgagaccat cctggctaac 88740 

acggtgaaac cccgtctcta ctaaaaatac aaaacattag ccgggcgtgg tggcgggcgc 88800 

ctgtagtccc agctacttgg gaggctgagg caggagaatg gcctgaaccc gggaggcgaa 88860 

gcttgcagtg agctgagatc gcgccactgc actccagcct gggcgacaga gtgagattcc 88920 

gtctcaaaaa aacaaaacaa aacaaagcaa aaaacctaaa actttcatac aataaagtat 88980 

acctaagata cttctagaag agaagattta catccaggac gtgtatggaa tttctgcaag 89040 

caataagtaa aagacaaggg acacgaagag gcagttcaca aaagaggaag ccaaaatgac 89100 

caataaacat gaaaggatgt ttaacctcaa aggaaacaag gaaatgaatt aaaaacatca 89160 

aatgccactt caaaactagt aagttggcaa aactaaaaat accaaggatg agaatatgaa 89220 

gcatggctat atgagtgcat ggaatggtac agtcactttc attaaaaatg cacacaattt 89280 

gttttttatt tatttttttg agacagtcta tgtcgcccag gctagaatgc agtggcatga 89340 

tctcggctca ccacaatctc tgcctcctgg gttcaagcaa ttctcctgcc tcagcctcct 89400 

gagtaactgg gattacaggc acatgccaca acgcccggtt aagttttgta tttttagtag 89460 

agueaq^gtt ttgccatgtt ggccaggctg gtctcgaact cctgacctca ggtgagctgc 89520 

ttcccaaagt gctgggatta gaggcgtgag ccaatgctcc tggctgaaaa aaatgcacac 89580 

jaf.trtac ctagcaattc catgtctaga ggcttatcct agagaaattc ttgcttatat 89640 

gear .iv ? qajg acgtgtacta gaatgttcac tagttgaatg tttaagtgaa aattaggaaa 89700 

taa,.gtaaat gttcattaac aggaaaatga gtaaaggtat atttataaaa caattaagta 89760 

gcta.Kijtqa ataaactaga gctgcgtgaa tgaactagaa ctggttcaat agtcatgtca 89820 

gattartgaa tgaatacagg tcagatatgt atagagtgtc atttgtgtaa ttaatttttt 89880 

tttttttttt gagatggagt ctcactctgt tgcccaggct ggagtgcagt ggcgtgatct 89940

cagctcactg caacctccac ctcctgggtt aaagtgattc tcctgcctca gcctcccgag 90000

tagttq79.it tacaggcatg caccaccatg cccagctcat tttcctattt ttagtggcca 90060 

cagyg-.ttca ccatgttggc caggctggtc ttgaactcct gacctcaagt gttccaccca 90120 

acttrncctc ccaaagtgct aggattacag gcgtgagcca ccgtgctcag ccatttgcgt 90180

gatttttaaa gatgtgcaga ataatgccat taaaaaaaat acacatacat gtatatatat 90240

acacgtttgg ctgggtgtgg tggctcacac ctgtaatccc agcactttgg gaggctgagg 90300

caggarj.^atc acttgagccc aggtgtacaa gaetagectg ggegagatag caagacccca 90360 

tctcaacaac agaaaggata attaggtatg gtggcatgag aggatcactt gagcccagga 90420

gttcgagtgt tatcaggcca ctgcactcta gcctggacaa caaagcaaga ccgtgtctca 90480

aaaoaataaa aataaaaagt atttgtatgt ggtcatagtc aaaaaacgta catggaagga 90540 

aaatgtcttt atttatttat ttattttttt ttttttaaga cagagtcttg ctctgtcacc 90600 

caggctgggg tacagtggtg taatctcagc tcaccgcaat ctcggcctcc cgggttcaag 90660

cgattcttct gcctcagcct tctaagtagc tgggactaca ggtacccgcc accacaccct 90720 

gctaartctt gtgttttcag tagagacagg gtttcaccat gttggcaagg ctggtctcga 90780 

actcctgacc ttaagtgagc cacccgcctt ggcctcccaa agtcctggga ttacaggtgt 90840 

gagccactgc gcttggccag gaaatatcta atttagtaag tatttatatc tgggaaagga 90900 

agggtcaggt ggtgattcat aggaactcta aagtctatgt ataatactta gggggacaga 90960 

aggaaataaa gcaaaatgct gatatttgat tgttgagttg tgtatatgtt agaagtataa 91020

cataggagat ctgattgata gtaggagaat gtttttaggt ggtaaaagtg gaaccgtggt 91080 

ggtttgtttt ggcagtagaa tcagttggtc atagtttgta tgtggaaggt aataaacaga 91140 

ccatgttaag gatgacttcc ggaattttgg tctgagtagt gggtggatga cagtgtcatt 91200 

catgagggaa gatgaagact gaggtaggaa caggtttggg agaagatgac atgttccctt 91260 

ttagacaagt ggaattatgg aagatggcag gtaggtggtt agctatatga atttgagata 91320 

aaagatttag gatggagata taaatttagg agtaacagcg tatctatggt attgtaagcc 91380

ttaagaatgg gtaggatcag ccaggaaata cagatgtata tgcagaagag aggagtcaag 91440

gaagccaaga caagttaatg tttaaagtga gtgatgtagt ccatgggcag atgctgctga 91500

gagggctgca aacaccagtg accctacaac atttttaaat gtcgtcttcc tgacagcagt 91560

gatcagtacc tgcaacgatc ttatttattt ttttcatgtt agtctccaca cacttgaatg 91620

tagacttttt gaaggcaaaa tcattgcctt ttctgagctg ggagcatgtc tggcacatac 91680

caagcactca acagttgatg tattgacttc atccagatac tctgagggcg agttatttcc 91740

tgctactagc ctttcacctt tcaatgttta agagcacaaa tacagagatg ggcacgtttt 91800

ggcatttctt attttgataa ccttttcctg gtaagatttt ttaatgttga aaaaaaaaaa 91860 

caagaaaaga gggttaaaaa tagtcttatg tcagatcctg tgatagaatt cacacttggc 91920 

ttaagccgct gggcaccttc ctatcttgga tgtcatatta gcttatctac agcagaattt 91980

ttactgtttt atgtagtaag gaagcaatta tatgattatt ttacagacaa attattcttt 92040 

atcttttatt tttttagacg gagtctctct ttgtctccca ggctggagta cagtgtcgcg 92100 

atctcggctc actgcaacct ccgcctcctg ggttcaagca attctctgcc tcagcctccc 92160 

aagtagctgg gcttacaggt gtccgccacc acacccagct cattgttttg tatttttagt 92220 

agagatgggg tttcaccatg ttggccaggc tggtcttgag ctactgacct caggtgatcc 92280 
acccgccttg gcatcccaaa gtgctggaat tacaggcgtg agccaccgtg cctggcccag 92340

acaaattatt atactctgag tgttagaggc ttaggatgtt ttcacttgat gctatgggag 92400 

gaacaagtaa taagatatga tacacaacca aagacctttc ttcacnatgc ttctagtagc 92460 

tagtactatg gatgacacat ggtaataata ttggttagca tttgtcctca atttactgtg 92520 

ctagtcactc ttctaagccc cttacaggta tatatttttt ttcatcaata atcctctaag 92580 

gtagtcttta ttattgacct aattttataa atcaagaaaa ttaagaccca gagaagtaag 92640 

caacttgtcc aagatcacat ggcttataag tggtagagcc agaatttgac cccagatgtt 92700 

gtgactacat tgtctctcca taagcaggtt caactctttt gactggatgc tgttccaagg 92760 

tcacttccct agagaagcct tcgctgacaa ctaccctcct gtgccctcct ccaaggctgt 92820 

ccartgtcct agaactttga atactcatct tagaataaag ctggtctaat ttttacagtg 92880 

ttatagaatg gatctctgac tgcaaaagtt ggtcataatt atctttttat gttctagtga 92940 

aaggcaaaga acaagagaag acctcagatg tgaagtccat taaaggtaag ttctgccctt 93000 

ggcagtccac tgcattaaaa agtgatgtgc tttgcatttg tgagtccttt aatcctgtta 93060 

cactccctct tttggcatta atcatttctg ccttatttta taattactta tgattttgat 93120 

ctatttccct ctttaacctg tataatgctt taacatctag catataataa gtaggctttt 93180 

tttccttttt tttttttgga gacggagtct tgctctgtca cccaggctgg agtgcagtgg 93240 

cgcgacctcg gctcactgca agctctgtct cccgggttca caccattctc ctgcctcagc 93300 

ctccccagca gctgggacta caggtgcacg gcgccacgcc tggctaattt tttgtatttt 93360 

ttagtagaga cagagtctca ccatgttagc cagtatggtc tcgatctcct gaccttgtga 93420 

tccgcccgcc tcggcctccc aaagtgctgg gattacaagc gtgagccacc gcacccggcc 93480 

gtaagtaggc tttttctacc ttaattttat ttttttgaga tggagtcttg ctcttatccc 93540 

caggctggag tgcagtggtg ccatctcggc tcactgcagc atccacctcc cgggttcaag 93600 

cgattctccr gcctcagcct cccgagtagc tgggattaca ggtggccgcc accatgccca 93660 

gctaattctc gcatctttag tagagacagg gtttcaccgt gttggccagg ccagtctcaa 93720 

actcctgacc tcaagtgatc cacccgcctt ggcctcccaa agtcctggga ttacaggcgt 93780 

gagccaccat gcctggccat aagtaggctt ttactgagcc ttgtgtgtat tggctatcct 93840 

agtgattaca gtgaaccagt gcccttctta ttaatcacac atttaattgt tccctaaaag 93900 

tgattagttc acttcattta tttagtaaga caaaaaatga agaatactct taactgagca 93960 

gtctgttaac tgtaggaaag cactgacact tataaggctt agttttctgt catttatcca 94020 

gaagtatggt tgattacagt ttttactttt ttatttgaat gaacaacctt aatttaaaat 94080 

atattttgtt tattttttgt tgggatcgat acattgtcct tgtttataga ttagagcatg 94140 

ctttttaaag atgctgtatt actcactgat tttatttgtc cagtgtacag agattgaagt 94200 

gggaaaatta taatggaaat tgtttccata gtcattacat attaatttca tcaatttatt 94260 

tccataaaat ctgtagattg ctacttattt agatttttcc ttcaaatgtt tttatgttgt 94320 

attgcttgca ctgagtattt attctatatg ctcaatttgc tggagaagaa gactaattat 94380 

aacttaggca agttgtaaaa ttagggaaaa aagtaaggta ccttacagcc tagtttactt 94440 

atttcttacg taaagccagt tagattccac attagttcaa actgccttct ttgagcaaaa 94500 

cttgattggc agtgataaag gcttaaagcc cttctcaagc agagacctgt aaagactaga 94560 

tctgactgta gtagaaggaa ggaacttaga tgtttcaggc agtgagaaca ccagtcttcc 94620 

actctaaact ttgccactaa cagtatgacc ttgggaagtt gtaactttct tcagattctt 94680 

catttgctga atggggggat tggcctagct aatttctaaa tctctactgg gctaaaaaat 94740 

tctgtgctta tactctgatt atgaagtaca taatctgtgc ttaacattca ctgacttatc 94800 

cttaggataa tacagaagca gtacaagaaa cagcccctca agatgtttgc agtctggtta 94860 

gaaagacaaa cttatacaca gaacagtagc aaatagacca aaataataat agctgccatt 94920

tatagaacac ttcttctgtt ctgggcatta gacaaaaact gactataacg gtgaacaaaa 94980

aagacttagg tcctgccctc attgaactta cagattagta ggggagagga acattaatca 95040 

agtaattcca cagatggctt agcctagatt ggtagtgatg gaagtaaaga gatgtgaacg 95100 

gacttgaaaa aaaattcgga ggcaaaatgg atagaagttt attattgatt aaatatgagg 95160 

tgtgagagag agggatattt aagattgata cctaccttct ggcttgccta acagaaccaa 95220 

aacaggaaat tatatgttca gttttgttat gttgggtggg aggtgctttt gagtcattca 95280 

tttatatatg ttatatatgt tattttatat gcatagtaat tttaaggtct gagttttaaa 95340 

ccaaaggtta gagagtgatt ttttagagtc tagcaaacct aagttgaaat cctgcctgtt 95400 

gaaatggctg tttactagct cattaaccta gggcaaagta ttcaacttgt tttcattttt 95460 

gtcttcatct ctaaaatgag gaaaatatgg tcttacaaga ttgtcctgag agatagatga 95520 

aataatatcc aaaaaaaaaa aaggtacata gagaaactcg tatagtgcct ggtatatagt 95580 

aggtcctcca ttggtagcta tcattatcta gttttaacat agccttcagt ttgttgaatt 95640 

agtcaaactg agtgaagcac tgcaaggaat tcagaggaat ttgagatcaa caaatgattt 95700 

ctgaagttta gggaagactt catggcaatg acacttacct tgtataaaag ttgaagaata 95760 

agaaagattt gaatgagaga ttctttctct tctccctacc agcccagctt cttatttgag 95820 

gatatattgg gcaaaggggc cttcagacaa gtagagggag atttttacag aaagattgag 95880 
acgaaggtat agaaggctgt aaagaccaga aaagagaatt gagacagagg aagcaggaag 95940

ccactgtagg tttttgagca agatattgat gctgtaagta tggtgtttat gaaaggttag 96000

tctggaagag atttgcagga tggagacccc ggaagttttt ttgttacaat acagaaagac 96060

ttgcactgag ggtgaggtgt taaaaataaa caggtaagta aatgtttaaa catcttgaag 96120

gaaaagtcaa caaatcttgg caagtaaaca gataacagtg aaaaagaatg ggaccaagat 96180

tttgagtttt ggagactggt ggattgaaca gacagggaaa ttgagaggag aatcagatga 96240

tgatgtttta agttgatatt tagacagatt gtgcttgaga tggtaaagtc aatgtgggtg 96300

ggaatgctta gtagcgagta atcagtgata caagaccaaa gcccaggtca aagacaagcc 96360

acagatacag atcagggctt tttcatctgc tccacagagg tgtaccctag gagctgttgc 96420

aaacagtcca tgtggagggt gtgagtaaga tgtttccctt gaattcgcca gaattacttt 96480

tttgttgttg ttgttgtttt ttctgagaca gattctcgct ctgttgccca ggctggaggg 96540

cagtggcgag atcgcgcagc tcactgcaac ctctgcctct cgggttcgag tgattctcct 96600

gcctcagcct cccaagtagc tgggattaca ggcttgtgcc accaagccca gctaatttct 96660

tctgtatttt tagtagagat ggggtttcac catgttggcc agactggtct cgaactcctg 96720

gcctcgtgat ctgcctgcct cagcctccaa aagttctggg attacaggcg tgaaccactg 96780

cacccggtcc cttgttaagt ttattttggt gggaagcaaa ggaggtttca gcttttaaaa 96840

agtttgaaaa ttattgctct ggtaataatt aaagatttga gagtaaatat gctttctagc 96900

dgctaagaata aaagaagaac agatagcctc aagaagggga gccaaagaag caggctatat 96960

crgacacact gggtgttgat aaatgggtat taaaagaatg agagcaatga gcagatagaa 97020

gag^aaatta ggagagtata ataccatgga gaccaagaaa gatagactat caggaaggag 97080

tggtaaaaat aagttactag ttctaagaga gatgttaaga gggaccgggg aaagccttgt 97140

acaaatgagt tagtagcatt ttacattata tacatctaat taagaaacaa tgcgagagtc 97200

tcaccattcc tatagactct tacttgtact tgtctgaaca cgaaaactgg cttttgttta 97260

tdajtaagct aaaaattatt ttgctccaat ttctcatgaa aataaaaata aaccttcttt 97320

tuacartgaa aaaatagttt gaagacagtc actcttcatt ttgtaattcc cacaactatt 97380

attgaatgac tgaaattatc tttattctga agccaaaggg gtgatactga tatttcttca 97440

gaciactaaa aatatatttt atgaactttt agtgtgcttt atcttttttt gttttttttt 97500

ttgagatgga gtttcactcc cgttgctcag gctggagggc agtggtgcaa tctcagctca 97560

ctgraaccct cgcctcccag attcaagcaa ttctcctgcc tcggtctccc aagtagctgg 97620

gattacaggc acctgccccc acacccagct aattttttgt atttttagta gagacagggt 97680

ttcaccatgt tggtcaggct ggtcttgaac tcctgacctc aggtgatcca cccaccttgg 97740

cctcccaaag tactgcgatt gcaggcatga gccaccatgc ctggcctgag gaatattttt 97800

ctaggttccc cccaccccaa gcatttattc tgcaatttta gttttgttcc taaagcaagc 97860

aaggtttaag gatttaaaaa taatccgtat tttagaatgc tttctggctt tgttactttt 97920

tatccacagc agaagttctc agagaatgat ctccctcttt taatttaact ttttggcaca 97980

gtaitttgag aactataaat aatattagaa tgctttctgg ctgggtgtgg tggctcatgc 98040

ctgtaatcct ggctacttgg gaggctgagg caggagaatc acttgaacat gggaggcaga 98100

ggttgcagtg agccgaggtc atgccactgc actccagcct gggtgacaga gcaagactct 98160

gtctgggaaa aaaaaaaaaa aaaaaaagag tgttttcttt cctattttcc accacttgat 98220

taagttactt ttcctcttaa gtattttttg ctgagtatgc tgacttaaga gtaatgttac 98280

aaaatttaat ttttaaagtt ctctgaaagc ccctttatga gagttttagg ctaccaaatt 98340

gtgtttaatt cttaacaatt ttttgaaaaa ttatagcttc aatatccgta cattccccac 98400

aaaaaagcac taaaaatcat gccttgctgg aggctgcagg accaagtcat gttgcaatca 98460

atgccatttc tgccaacatg gactcctttt caagtagcag gacagccaca cttaagaagc 98520

agccaagcca catggaggcc gctcattttg gtgacctggg taagtaacta tcatttttta 98580

ttaacttgta ttagaaggat ttgagtacaa tatgtgaaac ttctgtcata ggatacagaa 98640

ctatataatt ggaaagtgct ttggaaaaaa tgtatttaaa ataacagcta caagtataat 98700

gggtagctgt gttgtgttcc tgtaaatata gaatataaag catgcccagt agaaaaacaa 98760

gcatttccag aagaaatata tctgatcact aaatataaat atatgaaaaa gatgtctcac 98820

tttattactg agggaagtgc aaattaaaat aatcagttaa tgttctccta acacattagc 98880

atatttttta aagtttgaca atttgaatgt cagtgaagat gcagggaaat acccctccta 98940

tttagtgata atataatctg gtgaagactc tttggaaagc aatttggaaa tcagtataaa 99000

atatgcatgt catttaggcc actctttcta agacctagcc ctcagatatg ctcattcata 99060

tgtgcaggtg tgtatgtgtg tgtgtgtgtg tgtgtgtgtg tgtatatgta tgtatgtatg 99120

tatgtatgta tgtatgttga aggctattca ttatagtatt gtttgtgata gcaaaaaatt 99180

atggacaaca tataaatatc tgttataggg aaataaccaa attgtggtat acgcatgctc 99240

tggagtataa tatagccatt tgtttctatt tatttatttt cttgagacag ggttttactc 99300

tgttgcccag gctggagtgc agtggtatga tcatggttca ctgcagcctt cacctcctgg 99360

gcacaagcca ttctctcgcc tcagcctcca gagttactag gactgcaggc atgtgtcacc 99420

acacccagat aattttttaa ttttttgtag agacagggtc tcactatgtt gcctaagctg 99480
gtctcaaact cctggcctca agcaattctc ccacacaggc ctcccaaagt gctgggatta 99540

ccaacgtgaa ccaccacacc tggttcagtg tagccattta gaaatctaaa aaagacgtgg 99600

gaaaatgtct aaggcatgtt taaatgtgag aaaagcaagt cacagtatgc atggtaaaat 99660

ccgttatatt aaaataagtt cttccaaaac aaaaacatat gcaggagacc tttattttgt 99720

cagtatttct tacccaaatt tctgcactta gaaaattgca tgtcatgttg tcataagttg 99780

aaaaaaagat ccatgaacca atggacttct aataaaatca gtcctgcttt tgacatctct 99840

ctctactttt gtgtatattc aaaccagagt gtcaatgtgt ttgtggggca cacttagcaa 99900

taatacatag cagacaaaat gcatatagct cagagagtaa aattgtaagt tttgctagat 99960

cactcataaa ttgctgatga gaatttaaaa tggtgcagat gctctggaaa acaggcagtt 100020

tctttctttc tttttttttt tctttttgag acagggtctc actctgttgc gcaggctgga 100080

gtacagtggc gtgattacaa ctcactgcag cctcaccctc ctcaggttca ggtgatcctc 100140

cctcagtctc ctgagtagct gggactatag gcatgcacca ccacgcctgg ctaatttttg 100200

tatttttttt tttttttttt gtagagacgg ggtttcgcca tgtttcccag gctggtctca 100260

aactcctgga atcaagcgat ccacttgcgt aggcctccca aagtgctggg attacgggcg 100320

tgagctactg tgcctggcct aggcagtttg tttgtttgtt tgtttgtttg tttatttatt 100380

tgtagacgga gtctcacagg ctggagtgca gtggcccaat ttttggctca ctgcaacctc 100440

cgcctcccag gttcaagcta ttctcctgcc tcagcctcct gagtagctgg gatgacaggt 100500

gcctgccata atgcctggct gatttttgta tatttagtag atatggggtt tcaccatgtt 100560

ggtcaggctg gttttgaact cctgacctca ggtgatcagc ccgcctcggc ctcccaaagt 100620

gctgggatta caggcatgag ccgtcatccc tggctggtgg tttcttatga cgtgaaacat 100680

gcaattacca tatgacctag cagttgcact ctgtatttat cccagataaa tgaaaactta 100740

ccttccaata aaaacctgtg cacaaatgtt catagcagct taatattgaa aaactggatg 100800

ttcttcagca ggtgaatgaa ctggttcatt cataccatgg aataccattc agcaataaaa 100860

aggaacaaac tgttgataca tttaaccacc tggatgaata tcaagggaat tatgctgtca 100920

gacaaaaacc agtccctaaa gactacatat agtatgattc cgtttggata atattcttga 100980

aatagagaaa ttaagagaaa tgaaaagatt agtgtttgcc agatgttaga gacagggagg 101040

tgagaggggt aagtgggtgt agttataaaa gtgcaacatg agggatcttt gtgatgttga 101100

agttgtatct tggcagtgga tgcagaaatc tcaatgtgat aaaattacaa agaactaaaa 101160

acaagaatga gtatagataa aactggggaa atctgaacaa gttagagtgt tgtatcactg 101220

tcagtatctt agagtgatat tgtactatag ctttgcaaga tgttaccatg ggagaaacta 101280

aagtgtacaa gggatctcta ggtattatta tttttttaga gatggggttt cactatgttc 101340

cccaggccgg tcttgaactc ctgggctcta gtgatccgcc tgccccagcc tcctaaagta 101400

ctggaattac aggcgtgagc gaccatgcct ggccctttca gtattgtatc ttagaacttc 101460

atgtgaatct agcattatct catagaattt aattaaaaga aattgtaaac ctcacagaag 101520

atcagaattt cctcaagttt gtgatgttga caaagatgaa ctagttgaca ctgacagtaa 101580

gactgaggat gaagacacga cgtgcttcaa aaaaatgatt tgaatatcaa tggattaaga 101640

agaactcttt tgacaaattg atgaaaccct cagtcagttt tataagaatg cccatcttta 101700

tgatcatgct atgaaagcca atttttaaaa aaattttttg tctttcctaa caattagctt 101760

gtggttataa tttaaattta gttaaatata agataaatga ttttttatta agtttagttt 101820

catttttcaa ggtacgatct caaagctact ctttaaccta ctatgaatga ataatgctga 101880

gttcataaca tctttgtaga tatatccaca attttccctc aggataagtg cctacaagtg 101940

gaattactgg actgaaaata atgcagtttg ctaagacttt gctatctgtt cctgaatgct 102000

cctccaaaaa ggttttgcca gtttacatcc tcatgaccag cgaatgagag tgttgcctat 102060

tttcctgtgc ccttgttact gcttaataat ttttgaaaaa aatctaattt gacagacaaa 102120

aatgcatttt atgttaattt gcttttctgg gatttttaat gaggttgagt atagttttta 102180

atatttttat tggccccttt ggaactagta tcataagttt tttttcttaa gaatttatgt 102240

agtctgggct gggcgcagtg gctcacgcct gcaatcccag cactttggga ggccgaggtg 102300

ggtggattgc cgaaggtcag gagtttgaga ccatcctgac caacatggtg aaaccgaatc 102360

tctactaaaa gtacaaaaac tagctcagcg tggtggcggg tgcctgtaat cccagctact 102420

taggaggctg agtcaagaga atcgcttgaa cccgggaggt ggaggttggt tgcattgagc 102480

cgagatcgcg ccattgctct ccagcctagg caacaagagt gaaaagtctc aaaaaaaaaa 102540

aaaaaaaaaa aaaaaagaat ttacatggtc tgaattgcca ttaaaagaga tatgagaatt 102600

attgagtaac aaataacttt ttaataattt aggcaagttt tggacgattg tactttgttt 102660

agaaaccaaa agcatagtat ttgtagtttt tttatttact ttagttgcta ggaagtaaac 102720

tttattcaag gtctctggta ccagttgttg ctaaaagtga ttgactaatc tgtcaatctg 102780

aaattatttg ttgctgaact gctaattctt ttgcttctat cttttaggca gatcttgtct 102840

ggactaccag actcaagaga ccaaatcaag cctttctaag acccttgaac aagtcttgca 102900

cgacactatt gtcctccctt acttcattca attcatggaa cttcggcgaa tggagcattt 102960

ggtgaaattt tggttagagg ctgaaagttt tcattcaaca acttggtcgc gaataagagc 103020

acacagtcta aacacagtga agcagagctc actggctgag cctgtctctc catctaaaaa 103080
gcatgaaact acagcgtctt ttttaactga ttctcttgat aagagattgg aggattctgg 103140

ctcagcacag ttgtttatga ctcattcaga aggaattgac ctgaataata gaactaacag 103200

cactcagaat cacttgctgc tttcccagga atgtgacagt gcccattctc tccgtcttga 103260

aatggccaga gcaggaactc accaagtttc catggaaacc caagaatctt cctctacact 103320

tacagtagcc agtagaaata gtcccgcttc tccactaaaa gaattgtcag gaaaactaat 103380

gaaaagtgag tatgtgattt tcttgtgtgt acatatgtgt ctcactttct ttttttaatt 103440

tactaagcag aacttcagat gaggaataaa atgattggaa tatttttttt ctcctctaac 103500

tacttgtaaa tttgggagaa tttggagagt gtagtagagt cagatcagtg tatggaaaag 103560

gagcaggagt gactggacct tctaagaagt gtgttatcag aattagtaaa tgaagggtca 103620

aatgtcctac ttttcccctc cactgatttt gacatcaaac cattatccac atagccttat 103680

ttcctccctc ggtcttaatt ttattaatat tttactgcac tttgcagata aaatttttaa 103740

aaaattttta aaaattgcca ataagtgaca tttattaagt tcagtgctta gtgtatattt 103800

ggattttatt tattagtcac aagacctttg tgcaggtagt aggcatgatt atcttttttt 103860

ttttgagatg gagtcttgct ctgtcgccca ggctggagtg caatggcgcg gtctcggctc 103920

actgcaaccc ccgggttcat gccattctcc tgcctcagcc tcccaaatag ctgggactac 103980

aggcgcctgc caccacaccc ggctaatttt tttgtatttt tagtagagac ggggtttcac 104040

catgtccgcc aggatggtct cgatctcctg actttgtgat ccgcctgcct cggcctccca 104100

aagtgctggg attacaggca tgagccaccg cgcccggact gattatctta tttacacatg 104160

agaaaaccag ggcttagaaa ggttaggtaa cttcctctag gttgtacagt aaatgtggac 104220

ctagaagcat tttgacaaga gcacctgttt ttttttcttc tctattagtt tagaaattat 104280

atactcttaa ttatcacctg ggattttgat tagacagcct tcatgttctt tttcatctta 104340

aatgttcttt gtgtcttaaa gggctaagtg atttcttcag atcttttagt tcactcattc 104400

tcagtgaacc aaaatgaggt ctaatctgct actgaatcaa gttttcagca tgttatttcc 104460

ttcctccctc cctcccccct tccttccctc aaccaggctc ccgaggagct gggattacag 104520

gcgcccgcca ccactcctgg ctaattttta tattttagta gagacggggt ttcaccatgt 104580

tggtcaggct gatcttgaac tcctgacctc aagtgaccca cctgcctcgg cctcccaaag 104640

tgctgggatt acaggcatga atcaccacac ctgacggcat gttattttca tcgcaaagtt 104700

actgtaagct gggagaagtg gcacacactt gtactcccag ctactcagga agcttaaggt 104760

gagaagaccg cttgagccca ggagttttga gaccaacctg ggcaacacag caagacccca 104820

gctcaaacaa agaaaaaaag ttattgaatt ttttatttct atggatcatt ttttgtagtt 104880

tcttattcct ttcacccttc attcccactt ttgatcccat cttttattta tttagtttta 104940

ttaaatgtat atttgtctga taattctgct atctacagtt ttttgtggac ctgactcagc 105000

atttctttgt ttcttcggat tcagactgtt ggtggcttgt gattttagtg atttttggcc 105060

gtgaacatgt ttcttggact tttgtctgtg ggaattctct gtgtactctg tataaattaa 105120

gttacttcag gtgttttgca ttttcttttg ccatgcacct ggggcctggg tcactaccct 105180

tctggtacca cttaaaactg aatttttgtc ttgggtgctc gtactgatcc tgtatgagta 105240

caggtttata cttactgtag aaatatggtg tttgattatg gggtattgtc ccagatggtg 105300

ctggagtatt aatatgctct ctgttaaact taatgtgttg tccctgtaaa actccaaaat 105360

tctgaattcc agaatactac tggccccaaa tgtttaagat aagggcactg cctgtatttg 105420

tttctgcctc ccactatttt ccttagttta acacaaactc acctttttaa aaaacatttt 105480

gagagaattc agtattggga agagtttcta acctgtttct ggaaatggaa gtccaaagtc 105540

tgtttctgta attgtttttt ttttgagatg gagtctcact ctgtcaccca ggctggagtg 105600

caatgacgta ctctcagctc actgcaacct ccacctcccg ggttcaagcg attctcttgc 105660

ctcagccccc tgagtagctg ggattacagg tgcccaccac catgcctggc tgatttttgt 105720

atttttagaa gagatggggt ttcgccatgt tggccaggct ggtcttgaac tcctgacttt 105780

gtgatctgcc cacctcagcc tcccaaagtg ctaggattat gtttctgtaa ttgtaataca 105840

tttattgttt ttagaaactg tctttgcttt agtggtaatt ttcaataaaa atagaaatag 105900

cagtggagtt attaaaagag cattagttac atttttccct ttttcattat cttcaaatat 105960

tatatatagt aagtttgacc tttttaaaat gtatacttgt atcagtttta acacatacat 106020

agattcctgt aactgtcacc actataaggg taaagaacag ttagttcctt cacctttgaa 106080

gtcaagcccc acctctatcc caacacttgg caaccgctga tctttctccg tctcaatagc 106140

tttgcctttt ctcttttttt ttcttatttt tttttttgag acagcgtctt gctctgtcgc 106200

ccgagctgga gtgcagtgag gcaatctcgg ctcactgcaa cctccgcctc ctgggttcaa 106260

gcagttctcc tgccttagcc tccctagtag ctgggattat aggcacgcac caccacaccc 106320

ggctgatttt tttgtatttt tagtagaaat ggggtttcac catgttggcc aggctggtct 106380

caaactcttg acctcaagtg atccacctgc ctcggcctcc caaagtgctg ggattacagg 106440

cgtgagccac tgtgcccaat caggactttt tttttttaaa tttacattca acttgtcatt 106500

tttttcttgt atggattgtg ccttcagagt cacacctaag agccctttgc ctaagcaaag 106560

gtcatgaaga ttttctcata tgtttccttt taaaagtatt gtggttggcc aggtgccatg 106620

gcttatgcct gtaatctcag cactttgaga agctgaggtg ggcagattac gaggtcagga 106680
gatcgagacc atcctggcta atgcggtgaa accccatctc tactaaaaat acaaaaaaaa 106740

aaaaaaatta gccgggcgtg gtggcgggca cctgtagtcc cagctacttg agaggttgag 106800 

gcaggagaac agtgtgaacc cgggaggtgg agcttgcagt gagccgagat cgcgccactg 106860 

cactccagcc tgggcaacac agtgagactc catctcaaaa aaaaaaaaaa agtattatgg 106920 

ttctacactt tacgtttaga tatatatctt ttttgagtta atgtcgtata agtatgaggg 106980 

tcacgtcaga ttttttgttt tttgtttatt tttacatatg gatgtctagt tgttctaaca 107040 

ccacttgttg aaaagacaac ctctactcca ttgaattgcc tttgtacttt tgccatattt 107100 

gcccaggcct gtttttggac tcctttttct gtttcatgat gtgtgtgtct attcctttgt 107160 

taataccaca tggtcttaat tactgtatag taagtcttaa aattgggtaa tgctggcctt 107220 

acaaaacgaa ttgggaagtt tttattttta ctcttatttc cattttctag aagagattgt 107280 

gtagaattgg tgtcatttct tctttagata tttggttgaa ttgggaagtg atgccatctg 107340 

ggcctagggt tttgtttttt gtgtgtgaga cagagtctca cttctgtcac ccaggttgga 107400 

gtgcagtggt gagatcttgg cttactgcaa cctctgcctc ccaggttcaa gttatcctcc 107460 

cgccccagcc tcccaaatag ctgggattac aagcgtgtgc caccatgccc gactaatttt 107520 

cgtdttctta atgcagacag ggtttcacca tgttagccaa gctggtctcg aacttgtgac 107580 

ctcaagtgat tagcccacct tggcctccca aagtgttagg attatagatg tgagccaccg 107640 

tgcctggcag gggcctaggg ttttcttttt cagagtattt taaactatga attcagatta 107700 

tttaatagat ataggactat ttaagttatc tgtttcttct tgagtgaatt tttactgtag 107760 

tttatggcct ttgagtaatt aattgtattg aattgtcaaa tttatgagcg tgtaattatt 107820 

tacagcattt cgggtttgta gtggtatccc tcttttattc ctggtgttgg caattgtgtc 107880 

ttgtttttct ttgtcagatt gtatagggat ttattagtct tttcaaagaa ctagcttttg 107940 

ttttgatttt tctgttgttt tgttttcaat tttattgatt ttctgctctt tattatttct 108000 

tttctattat ttctgcttgc tttgggttta ttttactctt ttttttttct ccaagttgct 108060

taaagtagaa acttagattt ctggtttgag acctttcttt tctaagataa gcatttaata 108120 

ctgtaaattt ccttctaacc actgctttag ttacaccccc acaaattctg gtattttgaa 108180 

ctgagcacaa atgaaatgtt ctaatttccc ttgaatctta ttcttttacc aatgaattat 108240 

ttagaaatat gttatttagt ttgcaagcaa ttggagactt ttttcctgtt atttttctac 108300 

catttatttc tcatttcatt atattatggt cagagaatat attttgaatg atttcattta 108360 

ttaattttta aaaataacat taaaaaattt tttaaaatgt gaatatacca catacagtat 108420 

aaagattgta cattctgttt ttggacagtt ttctataaat gtcaagttga tttagttggt 108480 

taatgatggt gttcagtttt tctttattct tgctgatact ttgtatgcag ttatatcact 108540 

ttattactca gaagagtgtt gaactttcca actacaattt ttttttccaa ttttactttc 108600 

agctctatct ggttttgctt catgtatttt gaggctctgt tgttaggtgt gtacacattc 108660 

aggatgatat cttctgggtg aattgcctgt tttatcatta tgtaattccc tctttatggt 108720 

aattttcctt gttctaagat cagaaatatc tgttgtccaa tttatataga cactgcagct 108780 

ttcatttgat tagtgcttgc atggcatatc tttttccatt tttttacttt tgatctacct 108840 

ttataattct atttaaaggg ggcttcttgt aggcagcata tagttgggta gtgttattta 108900 

tttatttatt tatttattta tttatttatt tattgagaca gagttttgct cttgttgccc 108960 

aagctggagt gcagtggtgc aatcctggct taccacaacc tccacctcct gggttgcagt 109020 

gactctcctg cctcagcctc ccaagtagct gggattacag gcacgcgcac catgcctggc 109080 

tgattttttg tatttttagt agaaacggat tttcaccatg ttagccaggc tcgtcttgaa 109140 

ctcctgacct caggtgatcc acctgctttg gcctcccaaa gtgctgggat tacaggcgtg 109200 

agccactgca cccggctgag tcatgttatt tttaatcttt tctcacaata cagggttttt 109260 

gttggtaaat ttaattattt taatataaat tttagtataa ttatttacat taaatgtaac 109320 

tgttgcactg gggtatttat aatgtgtaaa tataattatt ggtattaata taattatatt 109380 

actcataata atattaatat ctttggattt agattaccag tttagtatat gtttttctgt 109440 

ttctccctct ttgatttccc cttttttgct tttttttttt ttttaattct tatttttttt 109500 

tagtatttgt tgatcattct tgggtgtttc ttggagaggg ggatttggca gggtcatagg 109560 

acaatagttg agggaaggtc agcagataaa catgtgaaca aggtctctgg ttttcctaga 109620 

cagaggaccc tgcggccttc tgcagtgttt gtgtccctgg gtacttgaga ttagggagtg 109680 

gtgatgactc ttaacgagca tgctgccttc aagcatctgt ttaacaaagc acatcttgca 109740 

ccacccttaa tccatttaac cctgagtggt aatagcacat gtttcagaga gcagggggtt 109800 

ggsggtaagg ttatagatta acagcatccc aaggcagaag aatttttctt agtacagaac 109860 

aaaatggagt ctcccatgtc tacttctttc tacacagaca cagtaacaat ctgatctctc 109920 

tttcttttcc ccacatttcc cccttttcta ttcgacaaaa ctgccatcgt catcatggcc 109980 

cgttctcaat gagctgttgg gtacacctcc cagacggggt ggcagctggg cagaggggct 110040

cctcacttcc cagatggggc agccgggcag aggcgccccc cacctcccag acggggcagt 110100 

ggccgggcgg aggcgccccc cacctccctc ccggatgggg cggctggccg ggcgggggct 110160 

gaccccccac ctccctcccg gacggggcgg ctggccgggc gggggctgac cccccacctc 110220 

cctcccagat ggggcggctg gccgggcggg ggctgccccc cacctccctc ccggacgggg 110280 
cggctgccgg gctgaggggc tcctcacttc gcagaccggg cggctgccgg gcggaggggc 110340

tcctcacttc tcagacgggg cggccgggca gagacgctcc tcacctccca gatggggtgg 110400

cggtcgggca gagacactcc tcagttccca gacggggtcg cggccgggca gaggcgctcc 110460

tcccatccca gacggggcgg cggggcagag gtggtcccca catctcagac gatgggctgc 110520

cgggcagaga cactcctcac ttcctagacg ggatggcagc cgggaagagg tgctcctcac 110580

ttcccagacg gggcggccgg tcagaggggc tcctcacatc ccagacgatg ggcggctagg 110640

cagagacgct cctcacttcc cggacggggt ggcggccggg cagaggctgc aatctcggca 110700

ctttgggagg ccaaggcagg cggctgggaa gtggaggttg tagggagctg agatcacgcc 110760

actgcactcc agcctgggca acattgagca ttgagtgagc gagactccgt ctgcaatcct 110820

ggcacctcgg gaggccgagg caggcagatc actcgcggtc aggagctgga gaccagcccg 110880

gccaacacag cgaaaccccg tctccaccaa aaaatgcaaa aaccagtcag gtgtggcggc 110940

gtgcgcctgc aatcccaggc actctgcagg ctgaggcagg agaatcaggc agggaggttg 111000

cagtgagccg agatggcggc agtacagtcc agcctcggct ttcacaactt tggtggcatc 111060

agagggagac cggggagagg gagagggaga cgagggagag cccctttttt gctttctttt 111120

ggattatttg aatttttcct taaatttatt tatcttactt atttatttat ttttttgagt 111180

gattctcctg ccacagctcc caagtagctg ggactgcagg catgtgccac tacacccagc 111240

taattttttt gtatttttag tagagacagg gtttcaccat attggccagg ctggtcttga 111300

actcttgacc tcaagtgatc cacctgcctc ggcctcccaa agtgctggga ttacaggcgt 111360

gagccaccat gccctgcctt tttctagaat ttatatattg agttcttgat tgtatctttt 111420

tatgtaggct ttttagtggc ttctctagga attacaatat acatactttt cacagtgtac 111480

tcacatttaa tattttgtaa cttcaagtgg aatgtagaaa acttaaccac cataaaaata 111540

gaactaggga tgaggttaaa aaagagagag aaaagaaatg taataaagat ttaataacac 111600

cgtttttttt tttttttctc tttttttttt gagacagagt ctctctttct gttaccaggc 111660

tggagtgcag tggcgtgatc ttggctcact gcaacctccg cctcctgggt tcaagtgttt 111720

ctcctgcctc agcctactga gtagctggga ttacaggtgc gcgccaccat gcccagctaa 111780

tttttgtatt tttagtagag acggtttcac tgtgttggcc aggatggtct cgatttcttg 111840

accttgtgat tcgctctcct cagcctccca aagtgctggg attacaggcg tgagccaccg 111900

cgcccggcta agtctttaaa tatttttttg acattgcact ttttctcttt tccttctagg 111960

attttagtaa cccaaatgtt agttttgtta ttgtttggca ggttcctgag gctttcctta 112020

cttctttaaa tttttttttc ctgttgttca gcttcgaaaa tttctattca tctgtcttca 112080

aattcactgg ttctttcccg ttatttccat tctgttattg agtctttgta gtgaatttta 112140

aattttgttt attatgtttt ttagttctaa aattttcttt ttttgtgtat gtcttatact 112200

ttgctcctga aactcttatt tgtttcagga gtgatcttat ttcttagagc atggttttag 112260

tagctactta aaatttgttt tatcatccca gcatatgtgt cctcttgatt gtcttttctc 112320

ttgtgagata atgggatttt ctggttcttt atatgacaat taattttgga ttgtatcttg 112380

gacagtttga cttacgttac atgattctga atcttgttta aatcctgtgg aaaatattga 112440

agtttttgct ttaacaagca gttgacctag ttaggttcag tccacaaatt ctaagcagca 112500

ttctgtcggc tctggttcca tcatcagttc agttttgtat cttatctgct tatgtgcctt 112560

tctgtgtcca gtctgggacc tggccaatgg tcaggtccca aagcctttgt acacttttag 112620

aagcagggcc atgcacaccc agctcacgag tggccccggg agtgcacata caactcgacg 112680

ttttcatggg ctccttcttt tctgtgatgt ccctgacacg ttctgccttc taagaacctc 112740

cctttatccc tttcctgttg tctggctaga aagtcagggc tttagattcc ctatacttca 112800

gcacacttcc tgtagctatg tcaacctctg tggccacgac ttcttcttct tgggactgca 112860

gtttctcttg tcagaaagta ggattcttgg agctgctgtc attgctgctg tggctgctct 112920

gatgctgcct gggagtcgaa ggagagaaag gaacaaaaca aaacaaccca ggggatttcc 112980

tccactctct ttgatccgtg agagccccct ttcctgttcc tcagaccaga aatagagggc 113040

ctgtcttgga acttcttctt tgtgcatctg gtgtgcagtt tcagcttttg agtccaggcc 113100

aggaggtgct ggacaaactt gtcaggagta cggaggtact gcaagttctg attacttttc 113160

tcagtccacc tgcttccaag tccttggatg catttgtcca ttgttttgag ttgcattcca 113220

tgggagagac agaagagtgt gcttatttca tcttgacata cttattagga tttcatatca 113280

aatcaacgga tgatattctc tatattaatt tgctgttttc cctttagcaa gcacattagg 113340

aaaataacac tttaacaccc gcctttggtg gtttctgtca taattattaa tacttgactt 113400

tttttttttt tttgagacgg agtctcactc tgtcctttga ggcattgtcc ccataaactt 113460

ttggtaaagc atcaataatt ttatctttca tccacacaag cttcaccata aatttgatgt 113520

ttattcttcc attttagcag aattcatgtt gctccaatag gggctgtctt caaactgatg 113580

ttttctcctt cttagtgcct cagagtagat cctgttcaga tacgttataa caggttaata 113640

tgagtttatt ttggtgtaaa agtactttga aattcatgca tagttttttc atcatatgca 113700

ttttccatag ctttgaacac ccccatgtaa ctctcctctt ccacaaacca aacaatgaaa 113760

aagcaccttt gtgatggaag tttattttgc aataggaact cacagtgatc taagccctgc 113820

tattcatgaa tataattcat tactggagtc caagttgctt tttggttttt gaagttctct 113880
tcttcccttg

gatgctgcta 

tgcttgaaac 

attgactaag 

ctgagtattt 

cgagggtctt 

gctgagcagc 

ccggcagccc 

cagtgaggct 

cacgaccagt 

gggcttcatc 

attagccaga 

acacagtgtc 

atgtccagtc 

aaaaaaaaaa 

cctgaataac 

tcccagacaa 

tgtgggataa 

tttgtccagg 

accttaactg 

tatgccctca 

atatatatat 

tgtattagta 

atatacatat 

gagtagtaag 

tatgctgatt 

ctgtgtttta 

ttcattcaat 

aaagttagta 

ccgatgttgg 

cagcaagacc 

tagctattct 

tacctataac 

tatttaaggg 

tttctgtgtt 

gcttttccta 

ttggttggct 

tgtggataaa 

aactctcttt 

aagtcttctg 

atcccagtgt 

tcatttttcc 

gttggactat 

tagctcttac 

ggtgttgcta 

tgcactacct 

aggtacagct 

ggaataaagc 

actctcgcca 

ggttcaagcg 

cacgcctggc 

aggatggtct 

gttacaggca 

aaacatgcag 

gagaggggac 

cacatagaag 

gcagtacacg 

tctgattgct 

agaggtgtta 

ccttctgaac 



caggtataga 

aaccaatacc 

tatggcaaaa 

ataatttttt 

tttatatcgg 

tgcctgggcc 

cgggccggcg 

cgcaccagtg 

aaaagccggt 

gggcaactgg 

ccacttctca 

gccatcacat 

atttctgtgt 

ccagtttcac 

aaaaaaaaaa 

aatgacagca 

tattccaagc 

atacagttat 

gtaacagctc 

ctgtgctgtg 

ccccctgcca 

atatacatat 

tatatgcata 

tagtgtgtgt 

gacaaacatt 

caacaaatat 

tcaggaagac 

aaatattaca 

tttgtgattt 

gcttctttaa 

cctgttagtc 

ggggtccatg 

ggctataaca 

gtactgtttc 

gttgctctat 

gtaatacaac 

agtgattttt 

aaggaagcag 

cttttactta 

aaggatcaag 

ctatcattat 

tccttgagcc 

tttaatatag 

tgtgtaccca 

actaattact 

ccttcatctt 

gacagaattt 

atgactgcat 

ggctggagtg 

attcccctgc 

taattttttt 

ccatctcctg 

tatatataag 

tccacacagc 

agttgttaca 

attgcttaga 

ggtagagatg 

tctatattct 

aatatttgag 

tctcacaaaa 



acaagatgca 

aattacagaa 

aaaaaatgac 

cttaacatgg 

attatagctc 

agatgggctg 

ggcggctacg 

cacgaagtgg 

accaaagtct 

atggccagac 

gtgggcctga 

ggcctgtgac 

catttggcac 

gtaactttat 

agtttttctt 

agatcaataa 

actttttatg 

tttatagatg 

agatatggca 

gcagtgtttt 

aaaaaaaaaa 

aatatatata 

tatagtatat 

atatatatat 

tcagaaaaat 

atttcttata 

cttaggtgaa 

ttctcataag 

atgaaataag 

tccttagtgt 

tcagctgtgt 

tcatgttggc 

taggcctggt 

actgagtttt 

ttttatgtgg 

agggatgttc 

ttttgagggg 

tttcaagtca 

agcttaatca 

ttgataacat 

atattttagg 

ccattcttaa 

ctgtccttca 

ctttgcatag 

gtttttatgt 

ccacaaatgt 

gctgatggtt 

tttttgtttg 

cagtggcgtg 

ctcagcctcc 

ttttgtattt 

acctcatgat 

catataaagt 

tgataggaat 

ggaaagaagt 

tgggagcaag 

caggtgagtt 

caaggaagca 

aaaggagatg 

cagaaccctt 



gtgaatactt 

gcaatgagaa 

aaaaaatgca 

aatttagcag 

actttaaaag 

cagtgtagcg 

ctaaccggca 

gcgggacaga 

ctaggcatca 

aggtgtctca 

cgtccctggg 

ttgccttttt 

agctggaggt 

tcttctgaat 

atatgttgga 

atagtacaca 

gatagactca 

aagaaactga 

gagtcaggat 

tcatactgta 

aaaaaaaaaa 

tatataaaat 

attatatatt 

atactagaat 

gttttcatta 

ggttatagca 

cgtatattca 

tcctaatatt 

acatgttctt 

gggtgctttg 

ttcttaaatt 

tccattttcc 

ggctgttggt 

gctgacagat 

gaatttgcta 

tgactgatta 

agtctgtacc 

aataaaacac 

aattaatgat 

tttgtgatca 

atgttaatta 

tcctgtccaa 

agtgagtttt 

tcttgtttta 

gaggatttag 

ttgaagtggt 

tggaagtgag 

tttgtttgtt 

atcttggctc 

caagtagctg 

tagtagaaac 

ctactcacct 

gtgttatagc 

gaggcagtag 

ctggaggcag 

gacaatttat 

gaaagatgtg 

ggaagcaaag 

tactgtagaa 

ccatgactct 



ttaccaaata 

atgacatcat 

cagaactgac 

ttcccttcct 

tttctcggct 

ggtgctcagg 

cagaccaccg 

aacttctggg 

gggctgcagc 

gtggtggcct 

caccctggat 

ttgccagttg 

gcaaggagga 

aaagacaatt 

cccaaattct 

tttattaaac 

ttttaacttc 

agcacagaga 

ttgaaactag 

ggttgggacc 

aaatatatat 

atatatatat 

agtatatata 

aaaaaaatca 

tatatacatg 

aaatagtttg

cagataaaag 

atgtattttt 

gcacttttag 

cactcactca 

ggcccactgt 

ttttctttct 

ggcttatccc 

gttgtcatga 

ctatcatcat 

gagtttgcct 

agttaatagc 

ttaaaatgaa 

gatgtaatcc 

aagaatttga 

cctgtgtggc 

attatttgtc 

gttcaaagga 

aatgtaatcc 

agtgatccag 

agaattttta 

tggtatgaga 

tgtttttgag 

acggcaacct 

ggactacagg 

ggggtttcac 

tggcctccca 

atacaaacag 

tgaaggagaa 

aagggatgaa 

ctagagtcac 

agagatgatg 

tcctcagcaa 

aaaaaaaaaa 

agttgtgtgg 



tatatctcca 

aggtaagcag 

aattttcgtt 

aatttgtttt 

gcattcggtg 

cctgcccgct 

gatggactgg 

gttggaagtc 

ccaagagtct 

ctccgtctca 

gtctacctgc 

attgtgccac 

gggcagcctc 

tgctaacctt 

taggctttaa 

actcactgtg 

taaagaactt 

agttaagtgc 

accctcacat 

agccttctct 

atatatatat 

ataaaatata 

ctaatatata 

aagtatctca 

tatgtatgtg 

aaagctttta 

aggttattta 

attcttcaaa 

cagatctgtc 

ctgctgggga 

accttccagt 

cccacacaga 

tatctgcttg 

gatttgaggt 

ccctagacca 

gtttgaagaa 

ctgactggcg 

accacactgc 

catgaaggaa 

gaaaacctct 

tttaggcaag 

tcctcttgca 

gccttcactt 

ttggattttt 

aatctatact 

aaaactttga 

gggaaaaaaa 

acggagtctc 

ccgcctcctg 

cgctcgccac 

cgtgttggcc 

aagtgctgag 

gtatatatat 

gttgatgtag 

ttccagtgct 

aggaaagaat 

gaaataattt 

agagaataga 

ctcagtttct 

ggttttttcc 
ctgtcagcta
tcagttcagt 
ccgtcccaca 
cttctgactg 
ctgctagagc
aaaggatatt 
atgggagagg 
ctacagattt 
agactccatt 
cccctcatcc 
ggtctctctg 
ctcattagcc 
actggctaaa 
ggcaggagga 
ccttacaaaa 
tcagaaggct 
cactgactcc 
aaaactgggc 
aagtgaatgt 
aaagtcacga 
aacagctagc 
gcacatttca 
ctccacaaat
gtatgttcat 
gtatccccac 
aaaaacaatg 
tttgattact 
tggattgtca 
ctcttggtat 
tgtttgtttg 
tgttgctaaa 
agatcaagtg 
tgtgctgtta 
acttaattat 
cccaactgtt 
agttcatact 
cactcagcia 
tataaagcat 
aaaattttca 
ttctgaaata 
atttatttaa 
ctaaatggat 
ttagatttat 
tatcagaaga 
acagcctaat 
ttaaatgtca 
aggaatcagg 
atagggacag 
tcaaattttc 
tctgtaaata 
tctgtgagtc 
attcgagcat 
gaagagccaa 
agtaatttaa 
ttgagaaacc 
tattttaaaa 
agaagagaaa 
ataggattgc 
gcatacagtt 
aagcaaatga 



ccaattctgc 
tctcacactg 
agactgcctc 
accttctata 
agccctcaga 
acaaaggata 
ggcacagagc 
agctattcag 
atataggcat 
cgggaggttg 
gtgattagcc 
ttcaaaagaa 
ggccagttgc 
tcgtttcagg 
aatttaaaaa 
gaggatcact 
ggcttgggtg 
aaagactaaa 
aaacagacca 
acacaaagtg 
taaatagcta 
tgttttaaaa 
ccataatttt 
ttttaaagtt
taacactagt 
ttgggccagt 
agcgagtatg 
tgtcctttgc 
tttaatgata 
cctttgtatt 
tcttttgatt 
ttacctgtat 
actcccaagt 
ttacaacttc 
tcgttttggc 
ttcacattgt 
aaaacaaagt 
actctatctt 
ttttcatttt 
atttggggag 
ataatgcatg 
gctatataac 
tactgctatg 
gtgctctgtg 
aaaagacata 
ggggagaact 
aatgacacca 
tatggaaaga 
ccttttgtcc 
ccagattgaa 
agccctcttt 
tccagcctct 
gtcatttcca 
taatatatta 
taattaaact 
ccacagaatt 
aaaagtagta 
ccagtgaaga 
tagtataatg 
ctattaagta 



agatgattgt 
tttacctgga 
cacttcagat 
aantggagtt 
actcagggaa 
cagattgaac 
ttccatgcac 
aagcccccct 
gattgatcat 
gtgggtaggg 
ctcatcctaa 
tccagagatt 
aatgtctcag 
ccatgagatc 
ttggccaggc 
gagccctgga 
acaaagtgag 
taacatattt 
ggacactagt 
agactaggca 
attgtttcgc 
tttctaccaa 
ttaagttaca 
atgctagaaa 
gttagttttc 
ttactagata 
tatgtctttt 
tcatttttct 
gtaacctttt 
ttgtttttgg 
tctgcttttg 
tttcttttag 
tgattcacaa 
ttttgcagca 
acagtccata 
gcatcctagg 
atttttgaga 
ggttaacagt 
caatgttaat 
tgattgagtc 
tcttcagatg 
taaatccaca 
tgcccttaaa 
taagacgtgg 
gggcatgttg 
acaaagtcat 

tggggagtaa 

gtatattttt 
atgtgcaggc 
gtgctgacca 
tatttctctg 
aactatcaat 
aaaagatgta 
gagagaacat 
actgcatgta 
tgaaacttgc 
aattttttct 
agcatttgca 
ctctttgtta 
gaaagaggat 



tcagtgaaca 
gatagcatca 
gccagtctca 
cccacagtcc 
atgctttaca 
aggcagatgg 
tctccaggtc 
ccccattctg 
tggctattgg 
ctgaaagtcc 
agctctttag 
ccatgaattt 
gcctgtaatc 
aaaaccagcc 
gtaatagctc 
gttgaaggca 
accttgtctc 
cacagtatca 
atgatccctt 
tcatgttata 
tgcagtttat 
taacatttta 
atcccagaaa 
ctgccaaatt 
ttgtgccctt 
aaaggtgtag 
cacgttggtc 
tttggaacat 
aactgtcatg 
agggtttcta 
catatgtact 
ttctatttaa 
gtgtgtatac 
aggatttgtg 
gtctttagtg 
gaatttgggt 
atttaaatat 
ttcttttaaa 
atttcctaag 
tgtagtgatt 
gctctcctaa 
tagatttgtt 
aaaaatctat 
ttaggcatag 
tttggttact 
aaaaaggtgg 
ggtagtgttg 
cccacttaaa 
actttagtga 
gtggaactgt 
aggtaaagtc 
gctggggccc 
tcattgtttc 
gaaaattcaa 
agagagtgca 
ttccagtgca 
tatgctcatc 
acagacaatg 
ggcttcaaca 
tcccagtctc 



ccaactgggt 
gatcccacag 
agtacaagtt 
cccccttggg 
tatatttacc 
aagagatgca 
atgccaccct 
tccttttggg 
tgatcagctc 
caaacgtgta 
aggccacagc 
taggcgctgt 
ccagcacttt 
tggtcaacat 
ttgtctgtag 
gcagtgagcc 
agaagaaaaa 
cagatttgta 
ggtttcatga 
tggtttttcc 
tttagcagtt 
ataaactttt 
tagaattgct 
gccttcagaa 
gctcaagtat 
tgcctcctta 
attttatgtt 
ttcttagtag 
catgctgcaa 
tgtataggaa 
tcaaaagact 
aacctcttaa 
atagtttgaa 
gagaagatgg 
caatggagca 
tcattgttag 
tttggatatt 
tataaattat 
ttaaaataat 
atgactatta 
tttgttagtt 
gaaatggctc 
tcattctttc 
tgccagtctt 
gtaatatgaa 
gagagattac 
acctaggcct 
ctctttcctt 
gtttctgcga 
ttacctggct 
tgcatttctt 
tgtctatagg 
aagttgtttc 
tgtattaaat 
tgtttttaat 
taaattgcag 
atttttactt 
agtatattaa 
agtgaaatta 
acaaagcagt 



gtcctctaag 
attgaggact 
gtggcctgtg 
ttcaataaat 
catttattat 
tgggcaaggt 
ccaagaacct 
ttttttgtgg 
aaccttcagc 
attctgcctt 
cacaagtcat 
atgctaagaa 
gggaggctga 
agtgagaccc 
tctcagctac 
atgatcgtgc 
ggaaaaaaaa 
ttgtctagga 
aggtcccact 
agccatgttt 
ccttatttta 
ttacagataa 
cattgaaagg 
aaaggtgttt 
acatattatt 
ttctaatcta 
tgttcctttg 
tttataagag 
atcttttttc 
ttaaatttta 
ttctatttta 
tttatatgcc 
tttagtggca 
acaggtggat 
agagtaagtt 
gaatgggctt 
tacaagatca 
gtgaactctt 
ttgtttttag 
gaattggttt 
aggctttaag 
cagaggtttt 
acttaacatt 
gaaggaagtt 
gtggcatgtg 
atacaggtaa 
ttaagataca 
ggtcgttccc 
agtcaccatt 
gacattctct 
ttcacactct 
aaataacaca 
tgatggcaag 
aactctaatt 
tatttggagc 
accagacttc 
tagtcacttg 
tctttttgag 
ttttgttgga 
aatttagaca 
ctcgattctg cctctttaca agaatacagg tactcagttg atttgttttc tcactccctt 121140

tctttgctat aagtttaaat caacaatttg tttaggttaa tatgtcctca tggaatggtg 121200 

gaaatgatca gatataaaat atttggtttg gttagtttac tctttatatg tttgctggca 121260 

aggaaccaca aatccagttt agtataattt ttactctagt tcactaaaag tttgcatcca 121320 

gctgtgtagg tagtgtttgt ttcttgttaa cttttttttc gtctaaaaga atactttaaa 121380 

acctt:caat ctcaaatgac tgtaacttgc tgacaggtgt taacagaaga agtagatctt 121440 

Ltcgccccct gcttatgacc tgtattttaa tatttgagct tatagattag agattgtgag 121500 

agaaatctgt ttatagtctt attttccctt gtgtattttt tcttcctagt acatggaaaa 121560 

agaggatgca gtgaatatct tacaattctg gttggcagca gataacttcc agtcccagct 121620 

tgctgccaaa aagggccaat atgatggaca ggaggcacag aatgatgcca tgattttata 121680 

tgacaagtga gttatattga tagatggatt cagcagatac ttattgaaca tttgatatgt 121740 

tttgtggaaa taaagatgaa taaactcagt ctctgttgtc aaggagctca caggaggcag 121800 

cdtaaaagct gcttttatat ggtgtttgta aagctttggg ggttcttaga acaaaagttt 121860 

ctgrr tji^aa aggggaggtg tatgtggggt aaacaggatg gcaatggtgg tgttcaagga 121920 

gtgtttccca gaagagagat tttgtttgga tcccaaagaa agaagggaat tttgctaccc 121980 

ag.» i ta jqca gaaaacaaca ttctaggcaa aggcattggc ccagaagcca tggaaacgta 122040

gggnaaagtg gcactttcaa gaaacttgag tttagataat caaaggagtg gggaataaat 122100 

atga^gat jc tggtactaat tggaatagat tgtaagggac cttgaatgcc tatttatggg 122160 

catartatac tttctgtata aatctgctca ggcacgttgt taattagttt tttattagtt 122220 

ttcact^»aa atgagaggat ggaaacatca tacagtaaac aaaattgaaa atatctggtc 122280 

aggcj'iar^a tgagcttgtg gccagctctg taacgtatgg tattcttttc atttaacttt 122340 

tctt *rtct 9 taaaaaaagt aattcgtggt cgggcacggt ggctcactcc tgtaatcaca 122400 

acac: ttq.*<g aggcagaggc aggtgaatcg cttgagccca ggaatttgag accagcctgg 122460 

gcaacat-ii: aaaacccgcc tttactaaaa atacaaaaat tagctgagcg tgatggcgtg 122520

cgcctqttgt cctagctact taggggcctg aggcagaagg atcacctgag ccttgggagg 122580 

tcgag^rt agcgagctgt gatccactgt actccaccct gggcagggca gtagagtgag 122640 

accctgtctc caaaaaaaaa aaaaacaaca aaggtaattt gttatttgta tccttaagca 122700 

aatgctJdjg gggtaacttg gggatagaga aaagtccaca gatgttaggg tttgaagaca 122760 

ctaatagtat ctaggccagt ggttcctgaa cattagtctg tgggctcttg ctgggctgtc 122820 

tgcatagiaa tcacctgaga gcttattaaa aataggtttt caggctggtt gcggtggctc 122880 

acgcctataa tcccagcact ttgggaggct gaggcaggcg gattacttga ggtcaggcgt 122940 

tcaagaccag cctggccaac atggtaaaac cccgtctcta ctaaaaatac aagaattagc 123000

caggcatgat ggcacacacc tgtaatccca gctactcagg aggctgagga aggagaattg 123060 

ctcgagcccg ggaggtggag gttgcagtga gcggagatca tgccactgca ctccaggctg 123120 

gctgacagag ggagactctg tctcagaaaa aaaaaaaaaa ataggttttc agtctgggta 123180 

ccggcggctc acacctgtaa tcccagcact ttgggaggcc aaggcaggca gatcacttga 123240 

ggtcaggagt tcgagaactg cctggccaac atagtgaaac cttgtctcta ctagaaacta 123300 

caaaaaatta actgggcatt ttgacgggtg cctataatcc cagctactag ggaggctgag 123360 

gcaggagaat tgcttgaacc cgggaggcag aggactgcat ctcaaaaaaa aaaaaaaaaa 123420 

aaaggtttcc agtccccctg tctcagaaat tctgattctg caggtttgag gtgtgaccag 123480 

gaatctttat tttnagaaga cataccagat aattctgata aatagccagt ttagggatgt 123540 

agtctaattt tcctattttg caagtaagga aaataaggcc cagagaggta atgattttct 123600 

caaagtcaca gaacaagtta gtggcagaat ttggactgga atgcagttct taatgttctg 123660

tccagtgttt attctggtac agtatgtttg tagaaggtat tacgtaagaa acattgttat 123720 

atagatgttg agataggaag agtttacatt tagaaatttg gtctaaaatg cctgaacatt 123780 

caagtcgtgg aggagtattg accaacttac tcaatacaac ataggagatt cacattttgt 123840 

tacaaaaatg ctgatttaaa aggagagttt tctttttttt cttctttttt attttttgag 123900 

atggagtctt gctctgtcac ccaggctaga gtgcagtgac acgatctcag ctcactgcaa 123960 

cctccacctc ctgggttcaa gcggttctcc tgcctcagcc tcctgagtag ctgggattac 124020 

aggtgggggc caccacgccc agctaatttt tgtattttta gtagagacag ggtttcacca 124080

tgttggccag gccggtcttg aactcctgac ctcaagtgat ccacccacca ctgcctccca 124140 

aagtgctggg attataggcg tgagccactg tgcccagcct gcttgttttt gtatcatata 124200 

tatgcatcat cataatcatg cattatcaac ctttgtattt ctgtcaggac atagaaacca 124260 

ttagagtgct tggaagagag cctttttttt tttctcgcat ttaatgcttt ttttggtatt 124320 

catttcataa tcagcttacc aaaacattac ctgcattata ccccatcaag gtagaaatct 124380 

ttgtgttatc aatattggtt actccctttc cacaccgagt catcagtaag tcctgttcta 124440 

tccaaatagg tcatatgcat ctagctcacc cctcagtgct gttttgtttt gaatttgtac 124500 

atgtttactc ctgatgcctt gtagttatga tgatgtgttc ttattttatt ctgtgcatac 124560 

aagttctcag ctcgcttttt agggaaaatg accatgtctt cctttcctat aaattccttt 124620 

ctatctatca agtcctcaac agagaatagg tacccataaa tatgtgattg ttagtttctt 124680 
tgcctcagtt
actaaggatg 
agtgggaaca 
tgcaggctca 
atacttgcta 
ccaagccaca 
ctgcagggaa 
aaccatggag 
agaactcata 
gtccagcaaa 
ccacataaca 
gatcaaagct 
aaagaaaaaa 
taatggagct 
tttagagtgc 
tccttcagga 
gttactgccc 
gatgatcctg 
acaaatttaa 
atgaaaatat 
caaaaagcta 
aaaaaaattt 
acaataatgt 
acttccagtc 
tcttttatac 
taccattgca 
ccaaaagcaa 
tgtataagta 
attgattgat 
gtgcagttgc 
tgcctcatcc 
tgtattttag 
ttaagtgatc 
gcctgggcag 
atgcacgttt 
tattaacata 
ttattattac 
gcatgcacct 
gaattaaagg 
aagaccctgt 
ccacatatat 
acaaggtatt 
ctgtaatccc 
gaccagcctg 
gtggtggtgc 
acccgggagg 
caagagcaaa 
atgtttgtga 
gccagtgagg 
acgctattat 
tagaagtcaa 
tgatgataat 
atctgtatca 
aatgcaaaaa 
cactgtcaag 
actttgagta 
ctgctatcct 
gcctgtaatc 
aagaccagct 
ggtgtggtgg 



gtagtctgat ccttacagct 
gttggcaggc agatagaaag 
aagaaaggtt acatcagcac 
atcaagtagc cttgtataag 
tggaatttga ttttacttcg 
catcctcttg gatttgatga 
ggtgggccac tccccaactg 
aaggtaaccc agaacttcaa 
aaatataagg tgggaaaacc 
gtgattaaga gcagaggcct 
ttttagtcaa cagtggactg 
ggtagtgcaa taataacaaa 
gaaaaactaa aaagataaaa 
gaaaaatctc tgttgcctca 
tccttctact tactaagaaa 
ggtttccaga aggaggcatt 
ctgaagacct tccagtggga 
accctgtgta ggcttaggct 
aaagaaaaaa aaaattaaaa 
ttttgtacag ctgtatatgt 
aaaaaagtaa aacagttaaa 
taaataaatt tagtgtagcc 
gctaggcctt cacattcact 
ttgcaagctc cattcatggt 
tgtattttta ctgtgccttt 
atagtggcct acgatattca 
taggttgtac catatagcca 
cactctgtga tgttagcaca 
tgattgattg attgattgag 
acagtcttgg cacactgcaa 
tcccaagtag ctgggattac 
tagagacagg gtttcaccat 
tgcctgcttt ggcctccgaa 
taactgaaat tctctaatgc 
acctcaaagt tactttgatg 
gtacctgaca catggtaagc 
gtatttttaa ataattagag 
atagttccag ctactcagga 
ctgcagtgag ccgtgttcat 
cttgaacaat taaagaaggc 
caccagtaac tgtcaacagg 
attaatagct tattaataat 
agcactttgg gaggccgagg 
accaacatgg agaaacccca 
atgcctgtaa tcccagctac 
cagaggttgc agtgagctga 
actccgtctc aaaaatataa 
atgatttatt cttctaatga 
ctatgttgct tgtatgtgtc
ataataccat acataaaaac 
atcattaaat agctagtagt 
attttcacga ttaaaattaa 
tataaagaat gtaaattttc 
tagttcttac tagatgtgtg 
accctggtag ttaggtagga 
tttctgtgct agatggtagt 
ggagcttagt ctacaaaaaa 
ctagcacttt ggaagatcga 
tggccaacat ggcgaaaccc 
cggacacctg taatcccagc 



tttaaacaac 

gtagcaagtt 

tgtcatcaca 

attctctgga 

gatatctttt 

tgttgtacga 

tttcacaact 

acgtatcaaa 

aagcagaata 

tgagtctggc 

cgtgtacgat 

agttagaaaa 

gaataaccaa 

tatttactgt 

acagttaact 

gttatcaaag 

caagatgtgg 

aatgtgggtg 

atagaaaaaa 

ttgtgtttta 

aagttacagt 

taagtgtaca 

taccactcac 

aagtgcccta 

tctgtatttg 

ttatagtaac 

aggggtgtag 

atggcaagca 

acagagtttc 

cttctgcctc 

aggcaggcac 

tttggccagg 

agtgctggga 

cattttcctt 

attaaagtaa 

atcaaaaaat 

agcagtatca 

ggctgaagct 

gcccctgcac 

attatgccgc 

attggaaccc 

aaagcgttgg 

tgggtggatc 

tctctactaa 

ttaggaggct 

gatcgcacca 

ttataataaa 

actagaggag 

atgtgtatcc 

tgaattttag 

aaacagaata 

accttttctg 

agggtaataa 

tatgtaagga 

aaaaagacat 

gttacagtgg 

ggtacatatt 

ggcgggtgga 

cgtctctact 

tactcgggag 



agtagagttc 

gacccaacta 

tagctctata 

ggaggtgctg 

taccataggt 

ttagaaattg 

ccattacgtc 

ctacaagaag 

gcacagtgga 

ctggtatgta 

ggtcctgtac 

aataaatttt 

gaacaaaaca 

actatacttt 

gtaaaacagc 

gagatgacgg 

aggtgaaaga 

tttgtcttag 

gcttataaaa 

agctgttatg 

aagctaatct 

gtgtaagtct 

tcgctgactc 

tacagatgta 

tgtttaaata 

atgtgataca 

taggccatac 

gcctaacgga 

actccattgt 

ccaggttcaa 

caccatacct 

ctgttctcga 

ttacaggcat 

atctgtaaag 

ggtaatgtat 

gttaactact 

aaaattagct 

ggaggattgc 

tccagccttg 

aacgttagct 

tagttttggg 

ctaggcacgg 

acctgaggtc 

aaatacaaaa 

gaggcaggaa 

ttgcactcca 

taaataaaag 

atttttccag 

aggtgaaaaa 

gaatactgaa 

gagtgtcagc 

attttaaagg 

aattaaaatg 

acttagacta 

gaatgattca 

taaacaaaat 

ggccgggcac 

tcacctgagg 

aaaaatacaa 

gctgaggcag 



accgccaaga 
tctccgggga 
gttctaggcc 
aaagttgctt 
acttctccct 
aatccaatat 
aggcctggac 
ttttattggt 
aattgaagca 
cagtcacgtg 
gattataatg 
aataagtaaa 
aaaaaaatta 
taatcattat 
ttcagacagg 
ctccatgcgt 
aagtgttatt 
tttttaacaa 
taaggatata 
acaacagagt 
attattaaag 
acagtagtgt 
acccagagca 
ccatttttta 
cacaaattct 
ggtttgtagc 
catctaggtt 
aattctgttt 
ccaggctgga 
ccaattatcc 
ggctaatttt 
actcctgacc 
gagctaccat 
tgacgataat 
ataaaataca 
tttattacta 
gggcgtagtg 
atgagcctgg 
gtgacagagc 
tagaaatgat 
tattatgatc 
cgactcacat 
aggagtttga 
ttagccgggc 
aatctcttga 
gcctgggcaa 
taaagtattg 
gaatttcaga 
acttaattaa 
gaatgacata 
tgttacccaa 
aaaagttcag 
cagagagaaa 
attttaagaa 
ttcaacaaaa 
aaatgtgttt 
ggtggctcac 
tcaggagttc 
aaattaactg 
gagaatcact 
tgaacctggg

ggacaaaagc 

tacgaacata 

ccagttacaa 

tcagtataat 

tttaagaatt 

attgaaattg 

taactcagca 

aatatacgca 

atcctatgga 

acaaatattt 

ttttttttta 

tcactgcaac 

ctgggattac 

gggtttcatc 

tcggcctcct 

taattttaat 

catttgttat 

taaaacagtt 

gtctttttgc 

cattcggttc 

gttggccctc 

actgattgcg 

acagatccta 

tggcctgtgc 

aggaattcga 

attagccggg 

gaatcgcttg 

agcctgggtg 

tagcctactt 

aaagtatcat 

aataaatcaa 

tatttaggat 

ttcttgagac 

ccacagcctc 

attttttttt 

ggcaggcaga 

tctctactaa 

cttgggaggc 

agatcacgcc 

tttttttaaa 

aatgccgcct 

gttggccagg 

agtgctggga 

tgatttcgtt 

tggagtgctt 

atcctaagac 
cagtaaatga 
agcataataa 
ggccttggat 
tctcctgagt 
tgccattttg 
gtttttgttg 
atggcacgat 
cagcttcctg 
ttttagtaga 
gatccacctg 
ctgaatgtct 
atttttatac 
caccttgaag 



agacagaggt 

gaaaatacgt 

aatatttaca 

acttttcctt 

taataattat 

atttgaaaaa 

ataatgttct 

atcacacgcc 

caaagacttt 

ctattttctg 

tgggggagaa 

gacagtcttg 

ctccatctcc 

aggcgctcac 

atgttggcca 

agagtgctga 

ttgtaaactg 

aggtagttaa 

ttagttggat 

ctggcttttt 

gaggagatga 

ctgatgagtc 

tctgccatta 

ttatttgtaa 

ctgtaatccc 

gaccagcctg 

catggtggca 

aacccaggag 

acaaagtgag 

actatcttct 

gctgtttcat 

gtaatatgga 

actttttgta 

agggtctcct 

ctgaatagct 

ggccaggcat 

tcacgaggtc 

aaatacaaaa 

tgaggcagga 

actgcactcc 

tgatggagtc 

gcttcagcct 

gtagtctcaa 

ttacaggcgt 

ttgcattacc 

tcatatgtta 

aagaaatcta 

tagagccaga 

ttttctaatt 

aaattttcct 

atctttttct 

gttttttgat 

ttgtcattgt 

cgtggctcac 

agtagctggg 

gaaggggttt 

cctcggcctc 

tgtttttgat 

cttttgttga 

aatgttcaca 



tccagtgagt 

ctcaaaaaaa 

aattatactg 

cgtagaatta 

taaacgtaaa 

aaaacaatgt 

tttgaagagt 

tggtgagtta 

aacatttatc 

ctaaaaagta 

aacccaacaa 

ctccagcgtc 

caggttcaag 

caccatgcct 

ggctggtctt 

gattacaggt 

tacaaaggga 

catttgtaac 

ttgatttcaa 

gtccagcaat 

atttctgggc 

tcacccaggg 

gggagaaaag 

attttaagtt 

agcactttgg 

gccgacatgg 

ggcacctgta 

gcagaggttg 

actgtgtctc 

aatcaaagca 

ttaggccatt 

atatattcat 

aaataagtga 

cgctgcaacc 

gggactagag 

gatggtccac 

gggagatgga 

attagctggt 

gaatggcttc 

tgcatggtga 

ttgctgtgtt 

aagtttcttt 

actcctggct 

gaaccactac 

gtgccacatt 

aaccatacct 

aggaggcata 

aatattcccc 

actgttgaca 

aatttgtaag 

tctgttaagt 

ataggttaga 

ttgagacagc 

tgcaacctcc 

attacaggca 

caccatgttg 

ccaaagtgct 

taggcactta 

tactatatat 

agcagaacta 



cgagatcatg 

caaaaacaaa 

aataagttct 

gaaatataaa 

taaaaacatc 

ggaaacagat 

aaagtgacca 

tcttaaggaa 

ataaaccaga 

ttaataccaa 

aattacatgc 

caggctggag 

caactctcct 

agctaatttt 

gaactcctgg 

gtaagccact 

taatacttgt 

cagtagaatt 

ctttaaaata 

ctttattata 

gggaacgtgt 

agttctgaca 

catacacatc 

gtggaaaaaa 

gaggctgcgg 

tgaaacccca 

atcctagcta 

caatgaacca 

aaaaaaaaaa 

tttgtggtaa 

attctatttg 

agcctctgaa 

atgaattctt 

tggaaattct 

gcatgcacca 

gcctgtaatc 

gaccagcctg 

tatggtggct 

aaccagggag 

cagagtgaga 

gctcaggctg 

tttttttgta 

tcaagcagtc 

ctataatgtt 

gtgcatttcc 

gattctcctc 

aagaagttaa 

ttctagtgtt 

aataaataac 

agagtattat 

ttacctagga 

atgtcttggt 

atcttgctct 
atagaaaact

acccatgtga 



ccactgcact 

caacaaaggc 

catgtttatt 

caataaacat 

tatgtacaat 

attttgatat 

tatatattaa 

atcagtttga 

aaaatcgagt 

ctttatgtaa 

attgtaattt 
aaaatcacac

ctggttttat 
ttcaagcaat

cacctggcta 
tttttttctt

aatccaccca 
cagtgagccg
ttgggttttt
cagcactacc
caccagggca 
at 1 t t.ataaa 
acatccacat 
acaccgactc 
atcccccact 
tagatttgct 
tgccaactgt 
gcatgtatca 
atcttatttg
gtgtacaagt 
t tger ctgtc 
tcct gggr tc 
accacracgc 
tct cq.iactc 

tdg.«.it t get 
cga grjet 
cat 1 t cat tg 
gttct icttc 
att tqrt t at 
ttt tar tt 1 1 
ggct g iagta 
attct cct qc 
cgggctgatt
caaart cct g 
cgtgagccat 
tgetet tgtt 
tcctgjjt tc 
accarricac 
aggctggtct
gattaceggc 
tcaaactcct 
gcgtgagcca 
gagtttcagg
caggt at 1 1 1 
tgtttgtttg 
gcagtggcuc 
cttcagcctc 
ctgtttgtag
gtgattctcc 
ctgtcttttc
ggcacccacc 
atgttcaccc 
aaagtgctgg 
gtattgtctt 
tcagcagaat 
tatgttttaa 
agattgtttg 
tcaactttga 
tcaatatata 
cagagcaatg 
agggaaagct 
cttttttcag 
ttaaatcatg 
tatggatttt 
caaccttgca 
tgctggattc 
atttacactc 
attctcttcc 



agcccctcta 
gctactatcc 
taaaaccata 
catacaattc 
tagaacattt 
ccaccagccc 
tattctggac 
ctttcactta 
gaattttatt 
aattgtaccc 
tcttgtgtgg 
gcccaggctg 
aagcagttct 
ccagctaatt 
ttgacctcaa
cactatgccc 
gagtcaagag 
gcaccatttt 
gaacttatta 
agtgtggttt 
tggcctttgt 
atttttattt 
caatggtgtg 
ctcagcctcc 
tttgtatttt 
acctcaggtg 
tgggcccagc 
gcccaggctg 
aagcgatttt 
ccagctaact 
caactcctga 
atgagctacc 
gggttcaagt 
ccttgctcag 
agtcctttat 
cttcattctg 
tttgtttgtt 
aatcacagct 
ctgaatagct 
agacagatct 
cacctctgcc 
actattaata 
accatgcctg 
ggctggtctt 
gattacaggc 
ctaatttgtg 
tttgtagttt 
gttcttttcg 
atgagagagc 
taaatctgat 
agatcatgtc 
atgagtagaa 
ttcagtttca 
attcaggaat 
aaagggtgtt 
gggttttatt 
tacctgagat 
catttactgg 
tatcagaaat 
attccaaaga 



gaagccctct 
tgacttttga 
ctgtgtattc 
agtggttttt 
ttttcactcc 
taggcagcca
atttcataaa 
gcatcatgtg 
cctcattatg 
tcctttctgc 
atacaggttt 
gagtgcagtg 
cctgcctcag 
ttttagtaga 
gtgatccacc 
ggctgtggtt 
gtaactctta 
gcaatcccac 
tctgtttggc 
ttgcacttct
tctagctttg 
atttattttt 
gtctcagctc 
cgagtagctg 
tactagtgac 
atctgcctgc 
ctagattttc 
gagtgcaatg 
cctgcctcag 
tttgtatttt 
cctcaggtga 
aggcccagcc 
gatcctcctg 
cccctttgcc 
atattctaga 
tgagttgtct 
tttttaagat 
cactgccacc 
agggccatag
tactgtgttg 
tcccagagtg 
gtgtcttcct 
gctaattttt 
gaactcctga 
gtgagccact
aacatggatg 
tcagagtaga 
attccattat 
atagaaatac 
tgttagctct 
atttatggat 
gtggcagaag 
tcatttaata 
ttccctatca 
gaatattgtc 
ctgttgatgt 
gaatctcact 
tattttgttg 
gaattgacca 
tagacataca 



tgggcccctt 
tggcatagat 
ttttcttgta 
atatggtcac 
agatagaaac 
ctagtctact 
catggaaccg 
ttcaaaagag 
gccaaatatc
catttatcaa 
tctttttgtt 
gcacaatctc 
cctcccgagt 
gatggggttt 
catctcggcc 
ttcatttctt 
aacttattga 
cagcagtgta 
tgtttttaaa 
ctgatgagta 
gaaaaatgtt 
ttttgagacc 
actgcaacct 
ggattacatt 
agggtttcac 
ctaggcttcc 
ttttttcttt 
gcacaatctt 
cctccccagt 
ttttagagac 
tccacctgcc 
aattttctca 
ccttggcctc 
catttttaaa 
taaatgtccc 
ttcctctacc 
aaggtctcat 
tcaacttcct 
atacacacta 
cccaagttgg 
ctgggattac 
gcttcagcct 
ttgcattttt 
cctcaggtga 
gcacccggcc 
tatcttcatg 
agcctttcac 
aaatagaatt 
aagtgatttt 
aatagttttc 
agagatagtt 
caaaaatctt 
tgatgttagg 
ttcctgattt 
atgttctttc 
gaaatattaa 
tggtcatggt 
aagattttgt 
taaatgtgag 
tccgtctgta 



ccattcactg 
tagcattacc 
cagctttatt 
agagttaggt 
cccctttact 
ttttatctct 
tatattatgt 
catcatgtta 
ccattgcaag
taatgctact
tttaaatttg 
ttatcaaatt
ttttaaaaaa 
tctgctgccc 
gggccgaagt 
tcacacccag 
tctcaaactc 
aggtgtgagc 
cccgagtagc 
agtagagaca 
ttcacctgcc 
aaaatattgc 
tatttatgtg 
ctccttgggt 
gttttcttaa 
tacatgttga 
ttgtggattc 
ttttttctgg 
tgtcttgttt 
tgtgggtttt 
tttaaggctt 
tgtatcagta 
ttgattttca 
gtataatctt 
atctgaacgc
agtgtatttg 
tgtctgtctt 



tccttcttgt 
tgttcttgtc 
gtgctaattc
aaccattacc 
taaactccaa 
atagagacaa 
ggtcttttgt 
tccatgtttg 
gatttatgac 
gtgaccattt 
aggtggagtc 
aacctctgtc 
ataggcacgc
gccagtctgg
ctgggattac 
tacataggag 
gattgttttc 
agcttctcca 
attccaataa 
gcatcttttc 
ctttggccat
ctgtcagcca 
tgttcaagtg 
gccagcatgc 
aggctggtca 
ggattacagg 
gaaggagtct 
aacctctgcc 
acaggtgcct
catgttggcc 
gaagtgctgg 
caggctggtc 
gggagtacag 
tttttatatt 
atattatttc 
ggtgggtttt 
aggctggagt 
gatcctctta 
cttttttttt 
tgggattaca
gtgtttcacc 
atggcctccc 
cttcttaaca 
ttctttcatt 
catttattcc 
tttcattttc 
tcttgcaact
tttaggattt 
ctagaactta 
cctatctgac 
caataaatgc 
tgggttcttg
ccatactctc

gattttggta 

ctggtccctc 

tgagagccgc 

aattatttta 

tattttaaac 

tctctctgta 

cacaaaaata 

aggagaactt 

tgtcatatgg 

tactacccaa 

tgtgagtaca 

ttatacattt 

aggccagcgt 

guttctcatg 

ctaatttttt 

ctcttggcct 

ggtaaaagat

gj9actgagg 

gcqccattgc 

tttaatattg 

gttaagcatc

aaaattttgt 

attttacaat

t t tgargaag 

acatatgccg 

tcttttaaat 

actttcactt 

tcctgggttt 

gtttciccat 

gyccgcataa 

tcattttagc 

ccacagatcc 

aggtgtgtaa 

gaaaaaggga 

cgcagtggct 

gtcaggagat 

aaaaattagc 

aggagaatgg 

ctccagcctg 

ttgggagact 

tgtatttgga 

cttttagaaa 

tgggaggccg 

tggtgaaacc 

gtagtcccag 

ttgcagtgag 

ctcaaaaaat 

aaagatgatg 

aattgcgtga 

tgttttgttt 

gtgcaatctc 

cctcccaagt 

ttagtagaga 

tccacccgcc 

tgtatttttt 

gaaagattat 

ctcggcaggc 

ctactaaaaa 

atctttctga 



ttgattacta 

tctgagtaac 

tttctttttt 

tttccctacc 

tttacttaac 

aatgctgcag 

actgaatttt 

ccaattctgg 

gctgtactca 

cctcaccaag 

gatacatatt 

gcacatttgt 

tattttattt 

gcagtggtac 

tctcagcctc 

tatttttagt 

caagtgatcg 

gaatattgag 

tgggaggagt 

acttcaacct 

tgctttcctc 

ttaatagtga 

acaaatctgt 

ttctattgta 

cgataattgt 

ggtaagctta 

tttttaatta 

agagtgtgca 

taatcttggc 

ctgtaaaatg 

ctgatacttg 

ttgtaataaa 

aatcataaaa 

aggcttgatt 

tttcaacata 

cacacctgta 

tgagaccatc 

cgggcatggt 

cgtgagcccg 

cgcagtggag 

gaggagccta 

aagccagctt 

ataacggaca 

agacgggcgg 

ccgtctctac 

ctactccgga 

ccaagatcac 

aataataaaa 

ttattcttaa 

tttttcactt 

tggtttttcg 

agctctctgc 

agctgggact 

tggggtttca 

ttggcttccc 

aaatgtataa 

cttcaccagg 

ggctcacttg 

taaataaata 

aatttttctc 



ttgctttgta 

agtcctcata 

gtttaactgt 

ctcccacccc 

ctattactta 

tgaataatct 

tagaagtgga 

tttttcttgt 

gctggccagt 

aatcaaaaac 

tctggatgta 

tggaacttag 

tattttttag 

aatcttggct 

cagagtagct 

agaaactggg 

gcctgcctca 

ggctgcatgg 

cctggagccc 

aggaattata 

ttttagctat 

tgaggtrtgag 

cacattccaa 

gtccagtgtg 

ggatgcggca 

gctcatgcct 

aattttacat 

gtataatgtg 

tctgccattt 

agaataataa 

gaaaaagtat 

aagatagtga 

tcactttctc 

acactaccct 

tttaattact 

atcccagcac 

ctggctaaca 

ggcaggcacc 

ggaggcggag 

cgagactctt 

gaaagtactt 

tttcagctgt 

aggccgggca 

attacctgat 

taaaatacaa 

ggctgaggca 

accattgcac 

taaaaaagaa 

gggatggttc 

ctgtaattgt 

agacggagtc 

aacctctgtc 

acaggcacgt 

ccgtgttagc 

aaagtgttgc 

aatgaagcag 

cgcagtggct 

agttcgaaac 

aagatggttt 

aaggcaagta 



ataagttttg 

gaattagttg 

gtatcttgga 

tgctatagag 

gtcggggaca 

tgtatataag 

atttctaggt 

ggaggtgggg 

cattttagaa 

attcctattt 

tgacagcttt 

gtcgttaaga 

tttttgatac 

cactgcgacc 

atggttacag 

tttcaccata 

gcctcccaaa 

tggctcatac 

aggagggtga 

ggcttcagtc 

agtatgaggt 

tgaaagttac 

gcccaggact 

aaaaaagcca 

agtctggatc 

agaattttta 

ttttttctaa 

gtggttaagt 

attggcagcc 

agtgaaaaga 

tcctttgagt 

ttcataggat 

ttccctaaag 

gatccgtacc 

ttcagtagaa 

tttgggaggc 

cgatgaaacc 

tgtagtccca 

cttgcagtga 

gtctcaaaaa 

gaaggaagta 

gtcagctttg 

cggtggctca 

ctcaggagtt 

aaagttagcc 

ggagaattgc 

tgcagcctgc 

tggacagtaa 

atttatttaa 

gtgtatgtat 

tcgctctgtt 

tcccaggttc 

gccaccacgc 

caggatggtc 

tattacaggc 

aaaagagaaa 

cacacttgta 

cagcctggcc 

taatatatgt 

aatttgtatc 



aaatcagaaa 

ggaaatattc 

gattgttcct 

aggtctataa 

ttaagcttgt 

tcattttcca 

caacctatgg 

agtaggaggt 

aggtttcctt 

accctgtaaa 

tcatattgaa 

atgtcttata 

agagtcttcc 

tccatctcct 

gcatgcacca 

ttgaccatgc 

gtgctgggat 

ctgtaatccc 

ggctgcagtg 

actgtgcccg 

tacatttcag 

ttctatttca 

gattgtttca 

gtattaaaat 

cagaatcttt 

caagtgtaaa 

tctattatta 

ataaaggctc 

gctaacctct 

tgccaacatc 

ttaagaatta 

atgccactta 

atagcttgat 

ccagttccca 

agtaacagtg 

cgaggtgggc 

ccgtctctac 

gctacttggg 

gcttagattg 

aaaagaaagt 

aaaggtttgt 

tgtagtgatt 

cgcctgtaat 

cgagaccagc 

gggcgtggtg 

ttgaacccgg 

gcgacagagt 

acctaaatga 

gaccttacat 

aatgtaaata 

gctcaggctg 

aagcgtttct 

ccggctaatt 

tcaatctcct 

atgagccacc 

tgataatttt 

atcccagcac 

gacatggtga 

tttagtttta 

agttggtata 



gtataaatga 

cctctttatt 

tctcaacaca 

gtgtctgttc 

ttatgtcttt 

tcaatataag 

ctctgtattt 

agaatgctgg 

agcttctttt 

catggggctt 
tctgtcgccc

gggctcaagt

ccatgcccgg 

tggcctcgaa 

ccttgtattg 

agcactttct 

agttgtgatc 

gcatgtacat 

agtcattgtt 

aacactgaag 

tatacttcta 

actgaaaaat 

atatcaacgg 

taactttgca

tatgcccaga 

tggagtgact 

tggtatctca 

atttactctg 

agttggttat 

ctgaaattta 

taacatgtaa 

gcagcaccat 

gtaggccagg 

ggatcacgag 

taaaaataca 

aggctgagac 

tgccactgca 

aacagtggta 

ttgaccacat 

tttagttctt 

cccaccactt 

ctgggcaaca 

gcgtgtgcct 

gaggcggagg 

aagactctgt 

gttcattccc 

aaagtctatc 

tatatgtttt 

gaatgcagtg 

tctgcctcat 
gacctcgtga
tttgggaggc
tctatgaaat aacttattag gaagatatct ctaaaataag atcactttgc ctaaaataaa 139140

ctgatatatt gatgttcaca gaatttttct tttaaccgac ttgataaatg cattattctt 139200 

gacgtcaagt gatccacctt ccccagcctc ccaaagtgct gggattacac acatgagcca 139260

ccgcacctgg cattattctt ataaaaggtt aaatttctag ttaagtttaa tgtcctcttt 139320 

gttcatgtac cattgcttat tttcttccct tcctactcac agtaatcatc cttatggtat 139380 

gcacttttgt ttgcttattt ttatgtaatt gatattacgc tccattctgt acgttgtact 139440 

ttcattcaca gtgagttttg gacattccta tgttcatcta tacagactta cttcatttta 139500 

actacactgt agtattccgt atgtaatatt tactataact catcactgta gcagagcatc 139560 

tcatagtgta tgtattactg ttttgccatt ttggtatcaa tgagtattta agtcatttgc 139620 

agtttttccc tcttataccc agtattacag aggatctctt tttatatgct tctttgtacc 139680 

aagaggcaga ttaaaaaatt tttttttgaa aaaatttttg aaaaaaaatg aaatgaagtc 139740 

tcactatgtt gcccaggctg gtctcaaact cctaggctca agcaatccct ccatcttggc 139800 

ctcccaaagt gctggggtta caggcatgag ccaccatgcc tggcctacat tttaaatttt 139860 

gatagctctt acaatttact ttgtaaagta tctgcatcat tttatgttct caccagtctt 139920 

taataagaat acttcatact tttggctgga cacagtggct cacgcctgta atcccagcac 139980 

tttgggaggc cgaggcgggc agatcaagag atcgagacca ccctggccaa tatggtgaaa 140040

ccctgtctct actaaaaata caaaaattag ctgggcgtgg tggcgcaccc gtagtcccag 140100 

ctactcgaga ggctgagaca ggagaatcac ttgaacccgg gaggtggagg ttgcagtgaa 140160

cttagatcac accactgcac tccagcctag caacagagtg agactctgtc tcaaaaaaaa 140220 

aaaagaatac ttcagactta attttttttc cagtcttaag tgtttgctaa tgagattgag 140280 

tttcttttgg tatgtctctt gattgttcag gttttttctt ttatgaattg actgttcatc 140340 

tctttttcac attatttctg ttgggtgatt ttattagtga cttgttaaaa ttctgtatat 140400 

tttttcagca tgacacttca ttattcaaaa aaaaaaaaag attctctatg tttctcgata 140460 

ctaatcattg gttggtaata ccttaaaaat aagaccctta ctgtattttt tgcttttttt 140520 

tttttttttt tttttttttt tttgagatag agtcttgctc tgttgcccag gctggagtgc 140580 

aatggtatga tctcggctct cagctcactg caactgcaac ctctacctcc ctgtttcaag 140640 

caattctcct gccttagcct cccaagtagc tgggattaca ggcatccacc accacaccca 140700 

gctaattttt gtatttttag tagagacagg gtttcaccat gttggccagg ctggtctcaa 140760 

actactggcc tcaagtgatc cgcctgcctc ggcatcccaa agtactggga ttacaggcat 140820 

gagccacagt gcctagccac tttttgcttt ttaactttgt tttatagtac tatagtttta 140880 

gtataaacag atgtatgtat acacacaact atggctttat aatatgtttc agtcattgtt 140940 

agagcaaggc ctaccttttg ggtgcttctt ttacaaaatt gtcttggcta ttcttgtgcc 141000 

ttttttctta tttgtgaatt ttagaattgt gaattacctg ttgactcacc atgttttgta 141060 

aactgaggat tttgaatgga attgcactca attaaagatt atcttgcttt ctgtgcagca 141120 

atgttttatt tcaaataatc cctactttaa attacttagg atagctataa attgtgtttc 141180 

tggctttcta gatttagatg aaacgcttta aattgattgt tttctcctaa atttaaaact 141240 

gattgttaga agttaaagtc ttctgttcat tcttatttag gaagatgaca tttggaagag 141300 

tcagtgactt ggggcaattc atccgagaat ctgagcctga acctgatgta aggaaatcaa 141360 

aaggtttgtg gtgtttttat acttcatatt aagcctttac tcacattagt gattgactgt 141420 

aagtcaaaga ccacttaagg tttaaactgt ttattttgta aagtaaccac tgtatctttc 141480 

accttgtgtt tatagtcaga agtaagtaca agggcttcct gtagtcacat ctttatgcaa 141540

tctcctctga atcaaaagtt agtgaacttg ctttgccact ccagaaggca catgaatatg 141600 

aaaaagcatt gtctattttc ttatttaatg gcaaaatacc cgacctaagt tggacttaat 141660 

gtttgagacc gtttatttta ttaaattata ttttttctct tttctttttt ttttttgaga 141720 

cagttcttgc tctgtcaccc agaccggagt gcagtggtct gaccgcacct cactgcaacc 141780 

tctgcttcct aggttcaagc gattttcctg cctcatcctc ctgagtagct gggactacaa 141840 

gtgcgcacca ccacacctgg ctaatttttg tatttttagc agagatgagg tttcaccacg 141900 

ttggctaggc tggtctcata ctcctgacct caagcaatcc atccgccttg gcttcccaaa 141960 

gtgctgggat tacaagtgtg agccaccatg cctggcctta ttaaattatt tttattaaat 142020 

ttcctcaaga ttgatgaaag taatgaaata taaaagtaat gaaatatatg tggaaaatag 142080 

actggattaa gaaaatgtgg cacatataca ccatggatac tatgcagcca taaaaaagga 142140

tgagttcatg tcctttgtag ggacatggat gaagctggaa accatcattc tgagcaaact 142200 

gtctcaagga tagaaaacca aacaccgcat gctctcactc ataggtggga attgaacaat 142260 

gagaacactt ggacacaggg tggggaacat cacacgctgg ggcctgtcgt ggggtggggg 142320

gctgggggag gaatagcatt aggagatata cctaatataa atgacgagtt aatgggtgca 142380 

gcacaccaac atggtacatg tatacatatg taacaaagct gcacgttgtg cacatgtacc 142440

ctagaactta aagtataata aatttaaaaa aaataaatat atgtggaaaa tattaatagg 142500 

tcaaaattca aattgttcat ttaatcagaa gagtagttta gtcaaatcca agggttagac 142560 

aacagaaatc ttttttgtca agtgcattct ttgtgactga tttcattttc ttcctggttt 142620 

acacaggaag atttcagaaa caaatgtgga tccgtgacag atggtatcta gaagttttta 142680 
gt ttggttga
attcaatttt 
cctgtaatcc 
accatcctgg 
atgcctgtgc 
gggatcacag
tggccaggct 
caaagtgctg 
cttccccccc 
gtgcgatctc 
ccttccaagt
tagtagagac 
tccgcccacc 
atcctatctc
caggcttggc 
cccaagtagc 
agtagagatg 
cctcccaaag 
ggctgcatgg 
gttggagccc 
ccacctaggt 
atactgggaa 
ggcattggcc 
gggtctcctc
tgtttttttt 
atcttggctc 
caagtagctg 
tagagatggg 
ctcgcctctg 
ttttttgttg 
gtcgcccggg 
ttcaagccat 
cgcccggcta 
gtctcgatct 
ggtgtgagcc 
gacagctcac 
agcataggtg 
agcagttgaa 
ttcagtctgt 
gtatgcatct 
tacattatat 
atgaaaagca 
gtcaggaaag 
taagcccagg 
cgaagctaga 
gagcaaagga 
ggacttgaaa 
tttgtttttg 
ttagtgtagg 
aagtaaggga 
tgtatttgaa 
acttgtatta 
agtttggtta 
aagtcaagta 
atttgagtga 
cgtagtctca 
catatcagtt 
ttatttctct 
tcgtgttaca 
tctctgtata 



attgacagta 
gataagtatg 
tagcactttg 
ccaacatggt 
acccgccccc 
gcgccatggc 
ggtctggtct 
ggattacagg 
tcttgtttga 
agctcactgc 
agctgggatt 
agggtttcac 
tcggcctccc 
ctttcttttt 
tcactgcaac 
tgggattata 
gggttttgct 
tactgggctt 
tggttcatac 
aggagggtga 
aatggagcaa 
gaggtcagtg 
aggaagaact 
cccagtatta 
tcctgagatg 
actgcaagct 
ggactacagg 
gtttcaccat 
ccttgcaaag
ttgttattta 
ctggagtgta 
tctcctgcct 
attttttgta 
cctgatgtcg 
accgtgcctg 
gaagaagtgc 
ttcggtgttt 
gacatatagg 
ttatgttatt 
attgttctgg 
tggcggagac 
gggtaggggg 
gcctcactga 
cagcatgtgg 
gagctcagca 
atgagcagta 
gtgtcaggga 
tcttatgctt 
gcttcaagaa 
tgatggtggt 
ggtagaggca 
ttaatttaat 
aacacccctg 
ttttttggtg 
ttactatgtg 
gtctgtttta 
atcctacaga 
ttccttgaca 
cattttagat 
ttggtctgta 



ttttattgag 
tttaagatta 
ggaagccgga 
gaaaccctgt 
gggtttaagc 
taatttttgc 
caaactcctg 
catgattcac 
gacggagtct 
aacctctgcc 
acaggcgcgt 
catgttggcc 
aaagtgctga 
ttttgtcggg 
ctctgccccc 
ggcacctgcc 
atgttgacca 
acaggcgtga 
ctgtaatctg 
ggctgcggct 
gaccatgtct 
gtggttttag 
ctacagtgtc 
atagaaaatc 
gagtctctct 
ctgcctccca 
tgtccaccac 
gtcagccagg 
tgctggagtt 
tttatttatt 
gtggcacgat 
cagcctcctg 
tttttagaag 
tgatccgccc 
gcctgatttt 
tcctgcttca 
gcagtttctg 
aaattttttc 
ccttcattca 
gtcctgggga 
agtaacagac 
ctgggagaga 

ggaggtggca 

aggaagagtg 
tgatcaagga 
gaaggtgagt 
cacattggaa 
agtgttttta 
gagaagcaga 
gtggattagg 
aaaagattat 
tgagacatgc 
tatatcctgg 
gtgtaggagc 
ccaggcacta 
ctccagcttg 
atgtttaatc 
tttcttgtaa 
tagaacacat 
cattaaaatg 



taaaagatac 
agagccactg 
gcaggcgggt 
ctctactaaa 
gatcctactg 
atttttagta 
acctcaggtg 
catgtctggc 
tgctgtgtcg 
tcctgggttc 
gccaccacat 
aggctggtct 
gattacaagt 
tgggaggggg 
caggttctag 
accacgcctg 
tgctggcctc 
gcttgtattg 
agcactttgt 
gcagtgaatt 
ctaaaaaaca 
aacagaggaa 
tttaggtagc 
tctgagctgt 
ctgtcggcca 
ggttcacacc 
cacgcccagc 

atggtctcga 

acaggcgtga 
tatttatttt 
gtcggctcac 
agtagcaggg 
agacggggtt 
acctcggcct 
tttttttttt 
tatgtatatg 
tttgttttat 
ccaaaccact 
ttcattttat 
agaaaacaaa 
aaacaaatgt 
gtagtaggga 
ttttgagtag 
ttcttggtga 
acagcaagcc 
gagttgggag 
gttggagcag 
agggattgct 
gaaacaacat 
ctggtagtgg 
atttctacca 
ccacataaac 
ttcttctttt 
ctagagattg 
tgctgaatgc 
gttccttttt 
ttctgtactt 
actggaagtt 
catgtgttgt 
ttgcctgaat 



taacttttgt 
gccaggcgct 
cacgaggtca 
ttagccaggc
cctcaggctc 
gagacagggt 
atctgcccgc 
catttatctt 
cccagagctg 
aagcaattct 
ctagctaatt 
cggaactcct 
gtgagccact 
acagagtcta 
caatcattct 
gctaattttt 
aagtgatccg 
ggtaaaagaa 
gagactgaga 
gtgatcacgc 
aaacacaatt 
gtgccagatg 
ttctgtccat 
ttttttttgt 
ggctggagtg 
attctcctgc 
taattttttg 
tctcctgacc 
gccaccgtgc 
ttgagacaga 
tgcaagctct 
accacaggcg 
tcaccgcatt 
cccaaagtgc 
taatctggtc 
tgttagcata 
atgaattaag 
atctctgctc 
agaacagtgg 
gttcctgctt 
agcctgtgta 
gtgctatttt 
acctgagcgc 
aaggaacaag 
ccgtgtggct 
gtcaccagag 
ggaaatgatg 
ctatcagcta 
tcttgccata 
aagaccagtc 
gcaagcccat 
taataaatag 
agttgtccag 
aatttattca 
caaggatgta 
aatgaccctg 
tcctggttgt 
acacctatag 
atatggtgtt 
ggatacacat 



aagaagaaaa 
gtggctcatg 
agagattgag 
gtggtggcac 
ctgagcagct 
ttcactacat 
cttagcctcc 
attttctctt 
gagtgcaatg 
cctgcctcag 
tttgtatttt 
gacctcgtaa 
gtgcccagcc 
gctctgtcgc 
gcctcagcct 
tgttattttt 
cccaccttgg 
caatattggg 
tggaaggagt 
cattgcactt 
tttttaagga 
acctttgtga 
aaggataatg 
ttgtttgttt 
ctgtggcgcg 
ctcagcctcc 
ttatttttag 
tcgtgatccg 
ctggcctggt 
ctctcgctct 
gcctgccagg 
ctcgccacca 
agccaggatg 
tgggattaca 
tcatacctct 
gtgtnaacat 
gtgtattatg 
gttctattca 
agtgcctact 
tcatggaact 
catgtgttac 
cgaggtggct 
agcgggggcg 
gatagaggcc 
ggaatggagt 
accatggcaa 
ggatttatgt 
tttggaaaat 
gtcatagtct 
cagttcgggt 
ctatgaagtt 
gaatttctgc 
atgtctcttt 
cccaaaaggc 
aataagaggg 
acttgttaag 
gttatttagc 
tcttgatgat 
tttgaaagcc 
aaaatttaac 
agtgattaca

tctgcttttt 

cttcccctgt 

ggatgtccct 

ctcattggac 

tgctatttca 

ctcatctcaa 

ttgtatatac 

tttttttttt 

cctcatcttc 

aggcacacac 

aggctggtct 

gattatagga 

tgttcattta 

cctgtaatct 

gaccagcctg 

qrar ggtggc 

gqgcccagga 

ucugagtgag 

tattacctct 

:q.*aattatg 

tacttaagaa

qjiaaaaatca 

cttagccaga

attgatattt 

catatcctca 

tctcagcagt 

tggccaacat 

ggctcacgcc

agttcgagac 

tagccaggca 

tcccttgaac 

ctggtcaaca 

ggtggcgggt 

ttaaacccag 

gcaacagagt 

ggagattgtg 

tttggttcag 

ctttctctcc 

gctttaattt 

ttttttgaga 

gttgcaacct 

ttacaggcac 

ctgagatgga 

tcaaacctct 

actacaggta 

cactatgttg 

tcccaaagtg 

ttcaccatgt 

cctcccaaag 

aaaagagcct 

atcatctaag 

aatgtagtgg 

tgataaatcc 

agagatttct 

ttgttgccga 

ggttcaagca 

cacgcctggc 

tcgaactcct 

gcgtgagcca 



ttagagatga 
gaactttctt 
ggctttctgt 
tgcxggcctc 
tatcttttat 
tggctgtggc 
tattatctct 
cactcactac 
gagatagggt 
ttgggctcaa 
caccatactt 
caagctcctg 
gtgagccact 
tttacatact 
caatactttg 
ggcaacaaag 
acatgcctgt 
ggttgaggcg 
accatgtgtc 
atgtgggcag 
cctagcacat 
tttcatttgg 
ttttttaaac 
ttacttccga 
cttagcactt 
ttctacaaat 
ttgggaggcc 
ggtaaagcct 
tataatccca 
catcctggct 
tggtggcacg 
ccgggaggcg 
gagggagact 
acgagtacct 
gaggtggagt 
gagactctgt 
gtctgtgatt 
aataccacct 
ttccttattg 
gagttttgaa 
cagtcttgct 
ctgcctccca 
gtgtcaccac 
gtcttgctct 
gtctcctggg 
cgtgccacca 
accaggctgg 
ctgggattat 
tggttagact 
tgctgggatt 
tctgagatta 
agacagtgta 
caatcccttg 
cttacatgtc 
tttttttttt 
ggctggagtg 
attctcctgc 
taattttgta 
gacctcaggt 
ccgcgcccag 



gaagaaagag 
gccctatgca 
tgcatttgga 
tatcacctta 
tcttttgctg 
aagccctgcc 
tcagagaggg 
cacttctttc 
cttgctctgt 
atgatcctct 
ggcttattat 
ccgcaagcaa 
actcctggcc 
tgtttatagc 
ggaggctggg 
tgagaccctg 
ggtcccagct 
acggtgagcc 
taaaaagtaa 
ggagtttgtc 
ggtaagtacc 
gattatctga 
ttggttgccc 
ggatactcac 
tcaagctaat 
gtagaaattg 
aaggcgagcg 
tgcctctatt 
gcacgttggg 
aacacagtga 
cgcttgtagt 
gaggttgcaa 
ctgtctcaaa 
gtaatcccag 
ctgcagcggg
cttaaaaaaa 
tgttaggaat 
tgacaatggt 
agggcagctg 
accttgataa 
ctatggccca 
ggttcaagca 
gcccagctaa 
gtcacccagg 
ttcaagcaat 
tccctagttc 
tctcgaactc 
tggcacacgc 
ggtctcaaac 
acaggcgtga 
tgagaagggc 
acaagaagga 
tgcttcgata 
attttaagga 
tttttttttt 
caatggcgtg 
ctcagcctcc 
tttttagtag 
gatcctcccg 
cccagagatt 



gtgcctttta 
tacgttattg 
atgaaatcta 
ctttgaacca 
aagtttcttc 
atggctttca 
accttcccaa 
ttttcttttc 
tgcccaggct 
cacctcagcc 
tttacttttt 
tccacatctc 
tattttctta 
ttatttctca 
ttggagaatt 
tctataaaaa 
acttgggagg 
atgattgtgc 
ataaaaatag 
tatactattt 
ccttaaatat 
gtggtaagat 
tttgccacac 
agaggccatt 
gcaattctta 
aagtctgggc 
gatcactgag 
aaaaatacaa 
aggccaaggc 
aaccccatct 
cccagctatc 
tgagctgaga 
aaaaaaaaaa 
ctactaggga 
ctgataatgc 
aaaaaaagaa 
cacacagcag 
ttgtttacag 
gaaagaattt 
tagagcacag 
ggctggagtg 
attctgcctc 
ttttctgttt 
ctggagtgca 
tcttctgcct 
atttttgtat 
ctgatctcag 
ctatttttgt 
ttctgacctc 
gccaccgtgc 
aagcaagata 
attgtaaaat 
cattggtggg 
gcttagactg 
tttttttttt 
atctcggctc 
cgagtagctg 
agacggggtt 
cctcagccac 
tctaaacaga 



cttttcaata 

cttaatcatc 

gcctctttgc 

ctcctttcat 

acttngagtg 

tgcaaggatg 

ctccgatgat 

cttttatctt 

ggaatcacga 

tctcgagtag 

gtagagacag 

tcagcctccc 

ttcactgtct 

gctggacatg 

ggttgagccc 

attgtttaaa

cagaggtggg 

cactgcactc 

tttctctttc 

ggcactatat 

ttattgactg 

tacggattat 

tgacatagac 

ctcttctcaa 

gatgatgtat 

acagtggctc 

gacaagagtt 

caattagggc 

aggcagatca 

ctactaaaaa 

gggaggctga 

ttgcaccgct 

aaaaacaatt 

ggctgaggga 

accactacat 

agaaagaaat 

gttagtagca 

ttcggctccc 

tcatcattta 

aggaaaagac 

cagtgacacc 

agcctctcga 

ttgtttcgtt 

gtggtgcgat 

cagcctcccc 

gtttagtaga 

gtgatctact 

atttttagta 

aagtgatttg 

ccagccaaga 

acttaagaag 

gatgttatga 

agacaaaact 

actcccatca 

tttgtgacag 

accacaacct 

ggattacagc 

tctccatgtt 

ccaaagttct 

gttctaacca 



taccttttcc 
cacctcatct 
tgttacctgt 
ggactgagct 
cctctgcagt 
gttcctcctt 
ctaaaatcct 
tttttttttt 
ctcactgcag 
ctggaactgc 
ggtttcacca 
aaagtattgg 
aaaattatct 
gtgcctcaca 
aggacttcaa 
aattagctgg 
agaatcgctt 
tagcctagtg 
atgactagaa 
ttcctgattc 
aattatttaa 
atttatgtaa 
actaagtttt 
tccccaaata 
ctgtgtatat 
tcacctgtaa 
aagaccagcc 
cgggcgtggt 
cgaggtcagg 
tacaaaaaat 
ggcaggtgaa 
gaactccagc 
agccaggcgt 
ggagaatcac 
tccagcctgg 
tgaggaatgt 
actacagggc 
cttcctctgc 
ctagcctata 
tgagttttct 
atctcagctg 
gtagctgaga 
ttgttttttt 
gttggctcac 
agtagctggg 
gatggggttt 
cgtctcagtt 
gagacggggt 
cccgccccag 
ttgagttttg 
ttacattaaa 
gcacgtgccc 
gtacttaaat 
tgtagacatc 
agttttgctc 
ccacctccca 
catgcaccac 
gtggctggtc 
gaaattacag 
gatgcttttc 
cctgtcagta

agccagtgga 

ggtagcaaga 

gaaacttggt 

ggttttctaa 

ggggcgattc 

actctggttg 

actgagggaa 

gaccttgcct 

tgagcatgtt 

ggctggccta 

gggagttaag 

cagctagttt 

ttttatatag 

atgtattgct

tttgtttttt 

agagcagaag 

gtttgcaagt 

tccttagcgt 

taaataagag 

agtctcgctc 

tgctcccagg 

cctgctacca 

ccaggccggt 

tggaattaca

atgcatgaaa 

atgaagaaat 

tctctgttta 

tcccatgttt 

gtggtccctt 

gaggctagga 

gtaatcagtg 

agtagtggag 

aaaggaaaat 

tagttgacac 

gttttttcgt 

catttgttaa 

cagtttaaat 

agtcagtgac 

aaggatgact 

agaacataag 

cataagcaac 

gggcaataca 

gttaaatatc 

gaggccaagg 

tgaaaccctg 

ctgaggcagg 

cactgcactc 

gagagctcaa 

aaagaaagcc 

gtatatgtgt 

ctgaatgatt 

gtagagagac 

ctactgaaga 

ttaaagggtc 

taacacaaca 

gcgcagtggc 

gaggtcagga 

acaaaaactt 

ggcaggagaa 



gaatgagaat 

attagctggt 

attagaggga

ggcagatttc 

catgtcaata 

aacagccagt 

gcagtaaaat 

aaggatccag 

caagaaataa 

caaattatta 

cctcggatga 

gcgtactctg 

atatagacta 

gagctggtct 

agattcttac 

aacgtttgcc 

aaaagtccag 

taaatgctct 

tttggggtat 

cattgtggtg 

tgttgcccag 

ttcaagcaat

tgcctggctg 

cttgaactct 

ggcatgaacc 

tgtacattca 

gggtgcaagg 

taagtgccac 

aagcagcatg 

cttaacatct 

ccactgaagg 

tgtagatcaa 

gcttgctttt 

gttttatttt 

tggaagacag 

acatgggaat 

gagcactgag 

tattttcttt 

attatgcagc 

tcgttttgtg 

gggtttcttt 

tagaaaatcg 

gggaaaccat 

aaagttcagg 

cgggtgaatc 

tcttagccgg 

agaatcgctt 

cagcctgggc 

tttgagtaga 

gtagagatat 

ggggtgaaaa 

gctggaacat 

ttagtaatac 

taaattagtc 

aaattaccta 

aaattcagca 

tcatgcctgt 

gttcaagacc 

agccaggcat 

ttgcctgaac 



gaattggagg

aatgttgata 

aggtcggatt 

atgtgtaaat 

gagtgactct 

tgagccttca

ttcattaaac 

gttttgtatt 

tctaccaaca 

aataaaaaag 

ttctcagcat 

gcttggatag 

gagaactaga 

ggaaggtttg 

ccaagagcat 

acaaactaac 

aactctgaaa 

gttatgtaag 

cacacaaaaa 

gtggtggtga 

gttggagtgc 

tcttctgcct 

atttttatta 

taacctcagg 

accatggcca 

attttgtctt 

aaatactgat 

cctcatgtaa 

gcacatttat 

aacaattgcc

atatacatgc 

aagctcaaat 

ttaatagtta 

gaccatctaa 

ggaatgacat 

gaaattctta 

tatgtgcatc 

aggcccagga

aggctcagta 

taaactaaaa 

gcctttgaag

acaaactaaa 

ggaaaccaaa 

ccaggtgcag 

acttgaggtc 

gtgtggtggc 

gaaccaggga 

gacgagcgaa

agttgtagga 

ttagagagat 

cgcatgtgtc 

agggctaaga 

acaaggcatt 

ctagagtaca 

acaactgcat

cttcacagtg 

aatcccagca 

agcctggcta

ggtggccggc 

ccaggaggtg 



tgggagagac 

ggagaagaaa 

tatgatatgt 

tgggaaggta 

gcaggggggc 

tgcagagcat 

caatatttaa 

ttttatgaat 

attaacttgt 

taagctgtgt 

gtgattacag 

agtagagctc 

atgtagcagc 

aaaacataac 

tatcctggtt 

actagatgtt 

caccttttca 

caatataatc 
tagtggtttt
ttttagtaga

tgaatcaccc 

gccaaataag

tgtttactag 

gaggtaaatc 

gtgaggttta 

gttctcttac 

tggtagtagc 

attcaagttt 

gtttccttcc 

attaagtaca 

tatgaaagta 

gttaatattc 

tccaaataag 

tcgatccatc 

agagctagct 

tgatcaaccg 

agtattattt 

gattaactgc 

tgaaacaact 

cagagcccag 

tggctcacgc 

aggagttcaa 

aggcacctgt 

ggcggaggtt 

accccatttc 

taaggtagca 

tcccatggat 

caggtagaga 

aaagttcatg 

gggtagtgtc 

agcacctgaa 

gccaaaacaa

taaagttaga 

ctttgggagg 

acatggtgca 

acctgtgatc 

aaggttgcag 



tggcatgagg 

aagattcaaa 

ccaaggttga 

gattgagttt 

ctgacgagag 

ttaacactgt 

acccttaggt 

tcagttattg 

tttaaagcaa 

atttcattca 

atgtgggctt 

tttgaaactc 

atactctgtc 

aaatgtgttg 

agggtttggt 

agttctttca 

aaagtttttc 
ttggttttgt

tcaagtgagg 
tttgagttgg
accatgttgg
atgtaaaatt

ctcacaagct 
tttctttagg
tcttcagtca
ttttacttca

ggaggtgaaa 

aggtatccag 

ggtggcccag 

gtgcgtaagc

ccattatcac 

ctaaaatgat 

ctacaaaggt 
accttcttat

attggacaat 

gaacgaaaga 

agcactttgg 

gccaacatgg

atttgggagg 

gagatcacac 

tcaaagttca 

agctgcccag 

ggagtgatct 

aattagtagg 

tctggccaga

ttatgcctta
ctctttacag
gggaggctga
gcactctggt
cgtcaatcac 
tctatagaaa 
gtggctattt 
aatcccagca 
gtgaaacccc 
atagtcccag 
ttgaggttgc 
actctgtctc 
gcaaaatagt 
ggccgggcgt 
atcacctgag 
taaaaataca 
ggctgaggca 
accagtgcac 
aaaacagtag 
ttttgcaggt 
ccaaaacccc 
agtcctaggc 
caacctccaa 
acaccattac 
cttccgtgat 
gtgtgtcttt 
tctattgaaa 
tctggatgcc 
tgttcactca 
cttctggctg 
tttgtaataa 
tttcagttat 
tttttccctt 
gtcactgttg 
acctgtaggc 
tggtattgtt 
gagtgtttta 
gaactgatac 
ggtgccagag 
aggcagcttt 
tatgcagtct 
tttcagtgac 
tcctgagatg 
ggtttacttt 
tcccttcaga 
gtggcacctt 
gagcaggatg 
gtgagctgga 
ctcactcact 
gactgcccag 
catgattgag 
aaaaaaaata 
aagtcttgaa 
gtttcatcga 
ctgttattct 
ttgattttaa 
gggcatttat 
ctaaaaaccc 
aaacccaagg 
aatatttcca 
agaataaaat 
ctacaagact 
tttgtatata 



ctgggcaaaa 
aaattaccaa 
cagacccaga 
taaatattaa 
ctttgggagg 
ttctctacta 
ctatatggga 
agtaagccga 
tttaaaagaa
ggtagctcac 
atcggtagtt 
aaattagctg 
ggagaattgc 
tccagcttgg 
actcgaagaa 
gactacttta 
aaaagaaaaa 
ctgcgaagga 
ctcgacattt 
tgaattgctt 
atttgtccgc 
accatccctc 
attggcaata 
agtccttgtt 
tattttaaca 
tcttaaactc 
taggtcactt 
gactcaaaac 
ggttggattc 
tttccaggag 
ataattgatg 
tacattacta 
gcctactgga 
ttgatgcaca 
gtggtttgct 
gagtggcagc 
tcttgacagc 
ttcctccttc 
cactatactt 
gagcattagg 
atacatgtgg 
gaacaaagca 
actggggaga 
aagtgtgcag 
gcacttctgc 
ttggtgagca 
aggaaagaaa 
tatatatata 
tgaacagaga 
cagaagaagc 
tttaaagctg 
aatatatata 
ggctcttctg 
agaacatgag 
tttttattag 
tccacttcac 
gtggttttgg 
ctaggctgtt 
cattttgtgc 



agagcaaaac 
gcatgacatg 
tgtgagaaag 
aaatatgttc 
ccaaggtggg 
aaaatacaaa 
ggctgaggca 
gattgtgcca 
aaaaaaaagt 
ccaaatggaa 
gtctataatc 
caaggccagc 
ggcgtggtgg 
ttgaacccgg 
gccacaagag 
ctagctgagt 
gttcctcatg 
ccccttctaa 
atactcattc 
gcctatagga 
catgtgcaca 
agtgctgtga 
ttgaatatgc 
tttttcattc 
atatgcccct 
aggaaaatta 
tggtatatag 
gttagagaaa 
ttgagataaa 
ttcaacacag 
agaatgggag 
cacatgatgt 
gaaaattatt 
atagacaggg 
ctcgtagtgg 
ttgtccttcc 
gtggtgctag 
ggcattaatt 
ctctgatggc 
aaaaccattc 
atttgcccct 
aaagaaagaa 
gttcttccca 
gagaaacatt 
ttggtcgtct 
tcagttggct 
cactccattg 
gttctcttag 
tgtataaata 
atttattcca 
aaaaaggctt 
aagtattcat 
tatgaattct 
ttgaaatata 
ccactactgg 
tcatctatgc 
attgcttcaa 
gcaactcttg 
caaactagtg
ttctgagcta 



tcaggctcaa 
aagttgacct 
atgatgaatt 
aagtggccag 
taggagttca 
aaaattagct 
caagaatcac 
cttgtactcc 
taaagaaaac 
tttcttaaaa 
ccagcacttt 
ctgaccaaca 
cgcattgcct 
gaggcagagg 
tgaaactccg 
ttttctttac 
tcctcattag 
tcctcattcc 
tctttatcct 
tgtacttgga 
tgtcccatgc 
ctacaggagg 
tctagggtta 
taatatctat 

tgggtaagtt 

caatatttta 
taaacactaa 
tgcaccttac 
ggaaatctgc 
ccaatgaaaa 
acaatcctag 
tcacacagtg 
agttttccaa 
accacatcct 
taactcatcc 
aaagcaggtg 
cagcttcagc 
tggaaggaaa 
agtatatagt 
tctcccctgc 
ttggaattct 
agaaatagcg 
aattatactt 
tgactttgac 
ttcttctcct 
tctgcatcgg 
accacgtggc 
atgttactgc 
tataattatt 
ttgcaatatt 
tgtgtaagtt 
gtacttaaac 
atttaaaatt 
ttgatctttc 
actttgcctt 
tgtgattaat 
tctttaacag 
ctgcctctgc 
ctttcagtta 
gagatgccaa 



aaaaaaaaaa 
ataaccagga 
tagcagacaa 
gtgcagtggc 
agaccagctt 
gggcatggtg 
ttgaacccgg 
agcctggaca 
aagagtataa 
taaaaaatac 
gtgggggctg 
tggagaaacc 
gtaatcccag 
ttgcggtgag 
tctcaaaaaa 
tttaggcagt 
tagatcagag 
atgattttat 
gtgttgatac 
cattcagcat 
cacaataccg 
gagtcagtga 
attcctagaa 
tgccaacatg 
acgtaacctc 
cctcacaaaa 
gtgttggtgt 
cattttcttt 
ttgtgaaaaa 
cagcactata 
acttccacca 
agagtcttaa 
tggcaataac 
ctgggaagca 
ctaatcagca 
agtcagcccc 
ggaacagggt 
ctgacaagtc 
tttcacattt 
taacagaagg 
gcactccagt 
atgactccac 
tttttttttt 
tgcctccccc 
ttctttagga 
gatcacacag 
gccagcgctt 
ttttgctcag 
aatcactttt 
tgattgtata 
tttggtacta 
catattatat 
gtgtcaactt 
caaatatttt 
gtgtttgaag 
tcattttgtt 
aaaagcaata 
atgttttgga 
agataaattc 
gtagttgtaa 



gaatgtctga 
gaaaactcaa 
agaccatcaa 
tcatgcctgt 
ggccaatatg 
gcaggtgcct 
gaggtggagg 
acagagtgag 
tgagaaaaat 
cagaaatggg 
aggcaggcag 
tcatctctac 
ctacttggga 
ctgagattgc 
aaaacaaaaa 
aagtgtgacc 
aaattcgaca 
gaatgcatga 
ctctctgctt 
aaactacctc 
gggaccttgt 
atgtctgcat 
gtagaattac 
ggaaagcaag 
tttaagcttc 
ttgtagtcag 
ccatccttaa 
tcttttcttt 
taagagaact 
tttctgatct 
taatgcagtt 
ccatttatga
gataagcata 
ttgtaaagca 
accgagagcc 
gagagttaat 
atgggtcaag 
taattcctcc 
gtgtgaatct 
tacttaactt 
ttttgcccct 
taaataaggt 
attctttgct 
tagtaagaga 
ccatcagcag 
cctcaatgca 
actttgcaaa 
gtccttgaga 
gaggcacact 
tgtaccacct 
ttaattgtgt 
tctgctttca 
catttgcttt 
tgtatggcat 
cttttaacaa 
taaaggttat 
ataacaattt 
taatcatttc 
actgcttata 
aagagaatag

acagaagcgg 

ccttgtactc 

caggctctgt 

ataatcagtt 

aaagaaaaga 

gtaagatttg 

agggagtaac 

aatgtaagta 

gtcattctga 

taagcacctg 

ttcaggagtg 

tgcccacaca 

gttcttaatg 

tttgggaggc 

catggtgaaa 

ctgtaatccc 

gttgtgcaat 

tgtctccaaa 

taattacagc 

ccagcctgga 

tggtgatggg 

cccaggaagc 

agagcaagac 

aatcccagca 

cctggctaac 

tggtgggcac 

aggaggcgga 

gtgagactcc 

gtcccagcac 

gtccggccaa 

gtggtggcgg 

ggttgcagta 

aaaaaaaaag 

ggatgtggca 

ctgagatagc 

catccttgca 

ttttttgaga 

ctcaccacaa 

ctgggattac 

gtttctccat 

ggcctcccaa 

ttttgaaacc 

tgccatactg 

gatttctgga 

caaaatggta 

tactgaggcc 

ggcctgggcc 

accgctttta 

agaagtttgg 

ctaggcaagg 

aatagttacc 

cttgggcagg 

tggagatggc 

tgagggcatt 

tgggtattcc 

aactgcctgc 

tgcctaaagg 

gtcagactgg 

gagctgggat 



cagcaaattt 

agtgtggccc 

attttaaagt 

gcattgtgcc 

gaacaccctt 

tattagagag 

atgaaagtaa 

tttttataac 

gtttacagta 

attgtaacaa 

atgaagtgac 

gggtttatgt 

ccagttgatt 

ttaagaattt 

cgagacaggc 

ccctgtcttt 

agctacgtgg 

gagccgagac 

aataaaaata 

attttggaag 

caacatggtg 

cacctataat 

agagattgca 

tctgtctcaa 

ctttgggagg 

atggtgaaac 

ctgtagtccc 

gcttgcagta 

gtctcaaaaa 

tttgggagac 

tatggcgaaa 

gcacctgggg 

agccaagatc 

aattttgcat 

cttacaaaat 

aggtaccttg 

ttggactaca 

tggagtttcg 

cctccacctc 

aggcatgcgc 

gttggtcagg 

agtgctggga 

agtctgaagt 

ccctaatgcc 

gaataatttt 

tcctaaccta 

caggaagggg 

agtgcctttt 

gcaattgtaa 

cgggagagat 

tctctggcct 

cactgaggcc 

actgggcgtg 

cagtgatgtc 

catgttcagg 

tgccttagta 

tctggcacat 

tgacagtgca 

cccagtctgt 

ttaccaagaa 



gagactcggc 

gaaattatta 

tggaatttga 

cacaaaataa 

catctttatc 

aaagtggtac 

aaagcaaatg 

tttttctact 

ctggaggttt 

agtacaaact 

tgacctctct 

ttctacacag 

ggacctgggt 

tggggccggg 

ggatcacttg 

actaaaaata 

gtggctgaga 

cgtgtcactg 

agaaaaagaa 

gcccaagatg 

aaactccatc 

cctagctcct 

gtgagccaag 

aaaaaaaaga 

ccaaggcagg 

cctgtctcta 

agctactagg 

agccaagatc 

aaaaaagaat 

caaagtgggc 

ccctgtctct 

aggctgaggc 

gtgccactgc 

ggggaaggag 

caggagccag 

ataaccctga 

ttaatctgtc 

ctcttgttgc 

ccaggttcaa 

caccacacct 

ctggtctcga 

ttacaggcgt 

gagttttttt 

taatgattat 

tctttagtaa 

atggagctaa 

agaagtccct 

catgcttctc 

taaacccaga 

aatttttaca 

tgtaaaaccc 

ctctccgggt 

gtgcagagta 

caataaagga 

gagggttgct 

actttatgta 

tcagaatgtc 

tctccttccc 

gggcaaggag 

gcaaatgaga 



tacttttttc 

gccagactca 

ttcctccaac 

gattccctgg 

atgttgttga 

ctttgtaact 

tcagccaaat 

tggatttcaa 

gactagttca 

tctttgctgt 

ccagtgacag 

tgaccttttc 

tgaactcctg 

cacggtggct 

aggtcagggg 

caaaaactag 

caggggaatc 

cattccagcc 

ttttgggcta 

ggcagatcac 
atcacatctc

atttggccag 

cagatcacga 

ctaaaaatac 
tttggccggg

ggattacctg 

tactaaaaaa 

agggagaaat 

actccagagc 

agatactgtt 

cactgcatgg 

agacatcctt 

agttatcctt 

ccaggctgga 

gtgattctgc 

ggctaatttt 
aagccatggt

aattacgtga 

gtattctcag 

acttcactta 

aagacacccc 

ggcttgtgag 

agatccttcc 

aatagaaagc 

aaatttgtaa 

ctcaaggtta 

gaacattgag 

ggagcggtga 

cactggaggg 

gcccactggc 
cctagagagg

gacgaggatt 
ttgaggacag
tgcggtggca
ccagcttcag ctgactcctg tatattgact gtgccctcag actcatccgt aagtgacccc 160740

aggctggcct ctcccacatc acagtaagaa ttccacacac catacaactt ggaaagaggc 160800 

tccagctgaa ggaagcccca cacttctttc aagtttttct tagtcctctc ttcttggcaa 160860 

agagtacctt ttgtttcttc taattatgta actattggtt tagtaaatat tcacccattc 160920 

agtcaccctg taagtggcag gcactgttta cagggacaca ggaaggaata aaaacttgca 160980 

ggcaccttgg agcttgcatt ctattgaaga ggtaatggaa gttgggatag cagctaaact 161040 

atgctggtat tggccaggcg cagtggctca cacctgtaat cccagcactt tggaggccaa 161100 

ggcgggcaga tcatgaagtc aggagatcga gaccatcctg gctaacatgg tgaaaccccg 161160 

tctccaccaa aagtaaaaaa aaaaattagc caggtgtggt ggcgggcgcc tgtagtccca 161220 

gctacttggg aggctgaggc aggagaatgg tgtgaaccca ggaggcgaag attgcagtga 161280 

gccgagatgg caccactgca ctccagcctg ggtgacagag cgagactctg tctcagaaaa 161340

aaaaaatatg ctggtagttt tgattcaaga tggcctttgg agcccatgat ttaggtctcg 161400 

tacccaccaa ggtctactgg aaaacatcag gctctcctgc tatagaccca tagggagagc 161460 

tgcagccgag agggggagct gaagagaagt gccccttctg tgtcctgtca gcctcatcct 161520 

tccgcaagga ccagttgctg tgccactcca ttcacttgct gcaagactgg aggtttttcc 161580 

tcaggtgttg agcacctggt ttacaagatg tcagcatctt gatgcctgag accatcaagg 161640 

caagtccctg aacagggctt accttagagt aaggcttaga agaggccgta aagtcagtct 161700 

cagctccgtg gctctgcaga gctttgggac atgtgaattc ttaaaaacaa gactattgta 161760 

cagttactat atgcatgcag tataaaatta taaccttgga aaatcctagc tagctgttga 161820 

gctaattcca taaagtaatc agctcctgag ttctgcagtg gtaataataa tcagcataat 161880 

gagtaaacac tgtgtgtgcc aggcagcgtc tcatttgatc cttgtgataa tcttgtaagt 161940 

actgattttc tcccctcttt aaacaaagtt tttttttttt ttttagagag ggtctcacta 162000 

tgttgcccag gctagtcttg aattc 162025 

<210> 36 

<211> 162025 

<212> DNA 

<213> Homo sapiens

<220> 

<221> mutation 
<222> 156,277 

<223> Nucleotide Base Change: T to C 



<400> 36 

gaattcctat ttcaaaagaa acaaatgggc caagtatggt ggctcatacc tgtaatccca 60 

gcactttggg aggccgaggt gagtgggtca cttgaggtca ggagttccag gccagtctgg 120 

ccaacatggt gaaacactgt ctctactaaa aatacaaaaa ttagccgggc gtggtggcgg 180 

gcacctgtaa tcccagctac tcaggaggct gaggcaggag aattgcttga acctgggaga 240

tggaggttgc agtgagccga gatcgcgcca ctgctctcca gcctgggtgg cagagtgaga 300 

ctctgtctca aaaagaaaca aagaaataaa tgaaacaatt ttgttcacat atatttcaca 360 

aatttgaaat gttaaaggta ttatggtcac tgatatcctg tttcattctt tatataatca 420 

ttaagtttga aatgtatact tgcactacta acacagtagt taatcttagt cctacaagtt 480

actgctttta cacaatatat tttcgtaata tgtatgcact ggtgtttatg tacgtgttta 540 

tgtttatatc tgttaaaatt agcagtttcc atctttttct attttgtacc atcacatcag 600 

ttcagaagga ttgacagagc aaaatgattt gatgaagtat aaaagtcaca tggtgagtgg 660 

cataaataca actctgaaca attaggaggc tcactattga ctggaactaa actgcaagcc 720 

agaaagacac atatcctata tgtcaagaga tgtaccaccc aggcagttaa agaagggaag 780 

tacacataga aagcacaatg gtgaataatt aaaaaattgg aatttatcag acactggatt 840

catttgctcc taaagtcaga gtcctctatt gtttttttgt ttttgtgggt ttctttttaa 900 

atttttttat tttttgtaga gtcggagtct cactgtgtta cccgggctgg tctagaactc 960 

ctggcctcaa acaaacctcc tgcctcagct tcccaaagca ttgggattac agacatgagc 1020 

cactgagccc agcccagacg ctttagcatt tatgaagctt ctgaaatagt tgtagaaacc 1080 

gcataagctt tccatgtcac tttcaaagtt tgatggtctc tttagtaaac caaccaagtt 1140 

attcctcaag ggcaaaataa catttctcag tgcaaaactg atgcacttca ttaccaaaag 1200 

gaaaagacca caactataga ggcgtcattg aaagctgcac tcttcagagg ccaaaaaaaa 1260 

aggtacaaac acatactaat ggaacattct ttagaagagc cccaaagtta atgataaaca 1320 

ttttcatcaa agagaaaaga gaacaaggtg ttagcaaatt cctctatcaa ataacactaa 1380 

acatcaagga acatcaatgg catgccatgt ggaagaggaa gtgctagctc atgtacaaac 1440 

cagtagataa tttcaacttg ctgccgaatg aaacctcttt gcaaggtatg aatcagcact 1500 
tctcatgttt gttttgcttt gttttgtttt gtttttagag acaggccctt gctctgtcac 1560
acaggctgga gtgcagtggc acgatcagag ctcactgcaa cctgaaactc ctgggcccaa 1620 
gggjtcctcc tgccttagcc tcccaagtag ctgggactac aggcccacca tgcccagcca 1680 
attctttaaa ttttctatag agatgggatc tcactagcac ctttcatgtt tgatgttcat 1740 
atacddcgac caaggtacaa tgtggaaaag ggtctcaggg atctaaagtg aaggaggacc 1800 
agaaagaaaa ggggttgcta catagagtag aagaagtcgc acttcacgcc agtctacaac 1860
actgccgttt tcctcagagc agagttgatg atctaaatca ggggtcccca acccccagtc 1920 

catagcctgt taggaaccgg gccacacagc aggaggtgag caataggcaa gcgagcatta 1980 

ccacctgggc ttcacctccc gtcagatcag tgatgtcatt agattctcat aggaccatga 2040 

accctattgt gaactgagca tgcaagggat gtaggttttc cgctctttat gagactctaa 2100 

tgccggaaga tctgtcactg tcttccatca ccctgagatg ggaacatcta gttgcaggaa 2160 

aactjcctca gggctcccat tgattctata ttacagtgag ttgtatcatt atttcattct 2220 

at at tdcaat gtaataataa tagaaataaa ggcacaatag gccaggcgtg gtggctcaca 2280 

cctnt^tcc cagcacttcg ggaggccaag gcaggcggat cacgaggtca ggagatcgag 2340 

accatcrtgg ctaaaacggt gaaaccccgt ctactaaaaa ttcaaaaaaa aattagccgg 2400

gtqto-jrjgt gggcacctgt agtcccagct actcgagagg ctgaggcagg agaatggtgt 2460

gaa:criigj ggcagagctt gaggtaagcc gagatcacgc cactgcactc cagcctgggc 2520 

gacn-nrrjj tactctgtct caaaaaaaaa aaaaaaaaaa aaagaaataa agtgaacaat 2580 

aaatqt.iutg cggctgaatc attccaaaac aatcccccca ccccagttca cggaaaaatt 2640 

ctcccu-jaa accagtccct ggtgccaaaa aggttgggga ccgctaatct aaataatcta 2700 

atcttoittc aatgctaaaa aatgaataaa ctttttttta aatacacggt ctcactttgt 2760 

tgcccj j jrt ggagtacggt ggcatgatca cagctcactg tagcctcaat cacccaggcc 2820 

ccanmv;: tcccacctaa acttcctgag tagctgggac tacaggcacg caccaccatg 2880 

cccigrtaar ttttaaattt tttatagaga tgggggtctc accatgttgc ccagactggt 2940 

ctCrtdj:cct gggctcaagt gatcctccct caaactcctg gacccaagtg atcctccttc 3000 

cttygrvtcc caaagtgctg ggattacaag catgagccac tgtacccagc tggataaaca 3060 

ttttj.inrcg cactacagtc atggacaatc aggcttttca acatgcagta tggacagtga 3120 

gtcccu i iqt ctgcttttcc atactgaaat acatgtgata ctaaggagaa aggtgctcgc 3180 

aaggat.it tt aaaatgaaga atatttaaaa tgaggaaaaa actgtttctt catgactttg 3240 

ataaggrtga taaagaccat ttctgtgatc tcaggtgatt cactcaagta gtatatttca 3300 

gtaaccatta tctggaacag cctgaatctt aaccaaaata ccacgatttt ttaatgctgt 3360 

tatgatarct tgatgatatg accaaactgc aatgtaggca gctaaatctc cacgagtttg 3420 

actcccccga gagttgacag ttttcttcac aaattaaaga aatatatttt ttgatacatg 3480 

attggcatat ttaaaaacta cactgaaatg ctgcaaaatg atataaagaa acattttcca 3540 

gaatcaaatg caatcaaaga gtggattagg aatctactca ccattatcaa ctaaatagaa 3600 

acacug3ac tgggtgtggt ggctcacatc tgtaatctca gcactttggg aggccaaggc 3660 

aggtgg^ttg cttgaggcca ggagctcaag accagcctga gcaacatagc aaaactctgt 3720 

ctctacaaaa aaaaaaaaaa attaaccagg catggtggca gatgcttgta atcccagcta 3780 

ctctggaagc tgaagtagga ggactgcttg agcccaggag atcaagactg cagtgagccg 3840

tggtcatgct gcgccacagc ctgagtgaca gagagagacc ctgtctcaaa aacaaaaaca 3900

aacaaaaaac acttaacctt cctgtttttt gctgttgttg ttgttgtttg tttgttttga 3960 

gatggagtct cactctgttg cccaggctgg agtgcagtgg cgtgatcttg gctcactgca 4020 

agctctgcct cccgggttca cgccattctc ctgcctcagc ctcccgagta gctgggacta 4080 

taggcgcccg ccaccacgcc cggctacttt tttgcatttt tagtagagat ggggtttcac 4140 

cgtgttagcc aggatggtct tgatctcctg acctcgtgat ccacctgcct cggcctccca 4200 

aagtgctggg attacaggca tgagccaccg cacccggcca acctttctgt tttttagttt 4260 

gatatgcttg ttaactcagc agctgaaaga atgctgaaag tggccttcag taaaaaaatt 4320 

tcactagaat ctctacatcc atatttaatc tgaatgcata tccagattga tcagttagag 4380

caaaaacact catcatcatt cctgatgacc tctaattctg gtttcggctt tctatttcaa 4440 

tggaaacaga ataaggaaag aaatggaagg gctctggaaa tttgtcctgg gctatagata 4500

ctatcaaaga tcaccaacaa taagatctct cctataaata taaaacaagt ataattaatt 4560

ttttaattat ttttttctct tcagaggatt ttatttcaag ataaaacata acttctaccc 4620 

atactattga ttccaaaggt tagaaaaagt gtttttcctc atcttatcct tcaaagaggt 4680 

cacagcaatg caaacatcta taaaatgcct ctgcataatt gtcagaagct atagtccaga 4740

aatcattgaa aatgcttttc cattttaagc ttaggtgagg tgtcttagga aacctctatg 4800

acaacttact ctatttattg ggaggtaaac tcccagactc tcccagggtc tcctgtattg 4860

atctcatttt ttaggcttcc taatcccttg aagcacaatc gaaaaagccc tggatctctt 4920 

ttctgcacat atcatcgcgg aattcattcg gcttccagca agctgacact ccatgataca 4980

agcggcctcg cccttctccg gacgccagtc cttgctgcgg ttagctagga tgaggggttt 5040 

gctgggcttc agtgcaggct tctgcgggtt cccaagccgc accaggtggc ctcacaggct 5100 
ggatgtcacc attgcacact gagctcctgg caggctgtac caatttttta attatttaat 5160

atttattttt aaaattatgg tgaatatttt ggtattctgc tctaaaatag gcccataaat 5220 

gcacagcaga tatctcttgg aacccacagc tttccactgg aagaactaag tattcttctt 5280

ttaaagatgc tactaagtct ctgaaaagtc cagatcctct acctctttcc atcccaaact 5340 

aagacttgga atctatgaga gatctagcta acagaaatcc cagacacatc attggttctt 5400 

cccagagtgc agtcctccta aagaggctca gccctaagca ggcccctgca ccaggagggt 5460 

gggtctgaga cccacatagc acttcccaag gtgcatgctc cagagaggca ctgaaacagc 5520 

tgagcacaag cctgcaagcc tggagaactc tcacagtcag aacggagggg gcccagtggg 5580 

actaacataa agagaaaagg gaacacagag aaatggatgg caccaacaac cagcaaagcc 5640 

ttcatggcca atgaaagcat cagtgacggg gccagaaccc tcatccccaa agactcttca 5700 

ctgcctttag tgaaaaacaa tggctagaga gtgaagttat gatcatgtat agagaggtaa 5760 

agttacattt ttatattctg actctgctaa tgtgaaattc cctatctgct agactaaaag 5820 

tttcagacac cctgttcaaa tatcccatta gttgctagag acttaaaatg aacagaacgc 5880 

acattgtcag gatgactatt accaaaaaat caaaagacag caagtattgg tgaggatgta 5940 

gagaaactgg aacttttgtg cactgtttat gagaatgtaa aatggagcag ctgctgtgga 6000 

aaagagtatg caggttcctc aaagagtaaa accaagatgt ggaaacaact aaatgcccat 6060 

cagtggatga aggggtagac aatatgtggt atatacatac catggagtac tattcagcct 6120 

ctaaaaaaaa aaaaggaaat tctataacat gcaacagcat ggatgaatct tgaggacatt 6180 

ttgctaatga aataaggcag tcatagaaag acaaatactg cacgactcca cttatatgag 6240 

ataccaaaaa tagacaaatt catagaatca aagagtacaa tggaggttac ctggagctgc 6300 

agggcgggaa acgaggagtt actaatcaac gaacataacg ttgcagttaa gtaagatgaa 6360 

taagctctca agatcagctg tacaacactg tacctagagt caacaataat gtattgtaca 6420 

cttaaaaatt tgttaagggt agattaacaa atgtagtaga tccacaaatg tggttaagtg 6480 

ttcttaccac agtaaaataa aaaaagaata tcaagcccag gagttcgaga ctagcctggg 6540

taacatggtg aaaccctgtc tctacagaaa atacaaaaat tagccagctg tggaggtgca 6600 

ctcctaggga ggctgaggtg ggaggcttgc ttgagcccag gaggtcaagg ctgcagtgag 6660 

ccatgattgc accactgtac tccagcccag atgacagagc aagacaccac cccccccaaa 6720 

aaaagaaaaa gaatatcaaa cattttaaaa gatcagatac gcaagaacaa caacaaaaaa 6780 

gagatgaaca gagcatcgac cctcatctag tgggattctt ggtctaactg aaaaacagac 6840 

attgagagac aaacaatgac agtgatgtga tcacagcaat tacacaggta tcccctgggg 6900

actgcagaag aaaggaggaa tgcctaactt tcagaaaata gagaaagcgt caaacagttg 6960 

gtgaaagcct tccaaaacta gagagaactg cacacaccaa atcacagaaa gaagaaaagc 7020

cgtgggagat tctgggaccc accggctatt tttgatggct gaacaccctg ctgcaggaga 7080 

gacaggagct ggaaagcatg gtgggatgaa acctcaaaca gctttgcctg cattgcttaa 7140

gatgactggg cttgattaac tctagtcaat ggggacaatt caatcaaaga agaaagatgc 7200 

tcaaattcac attttagaat gattttttat ggcagtatgg ggaatagatt aaaagagagt 7260 

gaagctggag gcaagaaact tgttaagagg caactgaaac agtctagatg ataaataata 7320

aactgacaga gtgactagaa aaatcagaac aggctgaatc aacagatacc tagatgaaaa 7380 

taacaggact tgatcaccag ttgtatcttg gagaggaagg agttgtttcc ttgctttccc 7440 

tacgactggg aatacggaag gtttgccgtg tgtattggtt atatactggt gtgtagccaa 7500 

tcactgacaa ccatttagca gcttaaaaca caaaggctta tctcccagtt tctgtgggcc 7560 

aggaatctaa gataggctta gctggctggt tctggctcag agtttctcaa gaggttgcaa 7620

tcaagatgtc agctggggtt gcatcatctg aaggctcaac tggggccgga gggtccactt 7680 

ccaaggagtt cactcacctg cctgacaagg cagtgctggt tgttggcagg agatctcaat 7740 

tcattgccaa gtgagcctct ctatagcatt gctggaacat cctccccatc tggcagttgg 7800 

cttctctcag catgagtgat ctgagagaga gagcaaggag gaagccacag tgttcttcct 7860 

actcctactc ctaacactat ggacctactc ctaacactct cacttctgcc ttattccatt 7920 

agttagaaag ggaactaagc tccacctctt gaaataagaa gtgtcaaaga atttgtggat 7980 

atatttaaaa atcatcacac tgtggaagtg gatagggggt tcaattaatg ctgaacttga 8040

aatgcctgag acattcaaat gtccaacagg caatgaacat acccatagat ggtcatgact 8100 

ttagcaagaa tagaggaaga tcacagaatt aaggaggaat tgaaaggtaa aagaagtgga 8160 

gtcagattcc ccctgaaaag tgagccatga aaggaacttt aactattgag ttagaggtca 8220

gagtaggaaa tttcggtgga attcttttztt aaagaaagga accatataag catgttttga 8280 

ggtagaggga gaataaatca gtagacaggg agaggtaaaa aacataaatg ataggggata 8340

gttgacaaag gtcttggcag aatcccttac ccattgactt ggggccaaga gagggacact 8400

tctttgtttg agggataagg aaaataagaa agaatgggtg ctatttagtg tggtcctgtc 8460 

tctagggcaa acgcataggt aacaaactgt gtgtgttagg aatatagatg tgacctcaca 8520 

ttgagattct cacctcaaat ccattttgtt gttacctgta ccttcctacc ttctcttttt 8580 

gctacatgca gactgctgtt ttgtcttcct ggcctgttcc aggtttcagc attctggcat 8640 

atctgctacc ctgttcccaa acctctctag agtccatgct ccttccttgg atagtgtttg 8700 
attgggccac gtatctaaga agtgatgcct tcagttaggc ctgagaacct cctctatgga 8760

aatctccatc agtgaccctg acagacttgg tatcttggag atgtcactgc tcccagcccg 8820 

tggtctagga gaatctcagc ctgggcctct agtagtatgg ataaggcgtt aaggtatcct 8880 

tgaaccagag tctgtcatat tcctcaatgt gggacagata aaacagtggt agtgctggtg 8940

tttctgagct agaactctgg tttttggtct agattctttg atgtatgacc tttcagaggt 9000

attaaaattt gttctaatac aatgttcaat acaaatgtag ttccttttct gttaggacct 9060 

caacaaaaca tgaccaactg tagatgaaca ttaaactatg acaattcatg gaaatgaata 9120 

cagtaatacc tgcggttccc ccattttagc agtcactatg gtgacatttg gcacaaatgg 9180 

ctatttaagg gtgcttttgt taaaacctac catcttacta ggcacatgat attgaaacta 9240 

atgaaataat ggagaaactt cttaaaaact tttaatgaat aaagtgatga agtgataata 9300 

ttttagctgc tatttataaa gtgactatta caggtcaaac attcttctag ggtttttttg 9360 

ttgaagttgt cacatttaat ccttaataac ccactatgag tcaggtattc ttctctcccc 9420 

tttggacagt tggggaaatg ggggtcagag aggttaggta atttgctcag ggccacacaa 9480

cctgcatgta gaaaatctga gatttgtaca ggaacgtatc aaactctgaa gtccatgctt 9540 

ctattttccc atgctgcctt tctaataaaa ggtaactaat gctactggat gctgccccca 9600 

aagtgagtca ctttcacccc accctacttg attttctcca taaaactaat cacatcctga 9660 

caacttattt attgctgatc tcccccacta gattataaac tcaataaaag caagatcctt 9720 

gtctgctgaa tatcagtacc taaaacgctg tctagcacag agcaagtaat taatatttgt 9780 

tgaatgaaca aataaaggaa aaaaattcaa aggaagaaaa agccctaaaa cagatgttta 9840

cctaaacata cattttaaaa gaaagcatat aacaaattca ggacagaatt taaatttgat 9900 

tttttaaaga aataaccaag tgctagctgg gcacagtggc tcacacctgt aatcctagca 9960 

ctctgggagg ccgaggcagg cagatcactt gaggtcaaga gttcaagacc agcctggcca 10020 

acatggtgaa acctgtctct actaaaaata cagaaattat ccaggcatgg tggcaggtcc 10080 

ctgtaacccc agctactcag gaggctgagt caggagaatt gcttgaaccc aggaggcaga 10140 

ggttgcagtg ggccaagatt gcaccactgc actccagcct gagtaacaaa gcaagactct 10200 

gtctgaagga gaaggaaaga aagaaggaaa gaaggaaaga aggaaagaag gaaagaagga 10260 

aagaaagaaa gaaagaaaga aagaaagaaa gaaagaaaga aagaaagaaa gaaagaaaga 10320

aagaaagaaa aagaaagaaa gaaagaaaga accaagtgct tatttgggac ctactatgct 10380 

atgtttttcc atgcacgcta ttttcagtaa agcagttagc aaacttgcaa gatcataaca 10440 

acaaatatat gcttctataa ctctaaaatt gtgctttaag aagttcctct ttaccagctc 10500 

atgtatgcat tagttttcta agagttacta gtaacttttt ccctggagaa tatccacagc 10560 

cagtttattt aaccaaagga ggatgcttac taacatgaag ttatcaaatg tgagcctaag 10620 

ttgggccagt tcatgttaat atactccaga acaaaaacca tcctactgtc ctctgacaat 10680 

tttacctgaa aattcatttt ccacattacc aaggagccag ggtaggagaa tatagaaaga 10740 

ccacccaaga atccttactt ctttcagcaa aatcaattca aagtaggtaa ctaaacacat 10800 

gccctaacaa tgaatagcag attgtgctca gaagaatgat ctacaacatc ttactgtgaa 10860 

ggaactactg aaatattcca ataagacttc tctccaaaat gattttattg aatttgcatt 10920 

ttaaaaaata ttttaagcct aaattttaaa aggtttgata ttggtacatg aatagacaaa 10980 

cagacatgga ctagaccaag aattaggttc aaacatatac aggaatttaa tatacgataa 11040 

atctagtatt ccaaaggaac caacaaatgg tgttcagaca gcaggatagg catcaggaaa 11100 

aacacagttg ggcaccctac cttactccta acaccaggag taactgaagg agcaccaaat 11160 

atttatttat tttaattata gttttaagtt ctagggtacg tgtgcacaac atgcaggttt 11220 

attacatagg tatacatgtg ccatgttggt gaggagcacc aaatatttaa aagaaaaaaa 11280

ttggccaggg gcggtggctc acacctgtaa tcccagcact ttgggaggcc aaggtgggca 11340 

gatcacctga ggtcgggagt tcgagaccag cctgagcaac atggagaaac cccatctcta 11400

ctaaaaacac aaaattagcc aggcatggtg gcacatgcct gtaatcccag ctacttggga 11460 

ggctgaggca ggagaatagc tttaatctgg gaggcacagg ttgcggtgag ctgagatatt 11520 

gcactccagc ctgggcaaca agagcaaaac ttcaactcaa aaaaattaat aaataaataa 11580 

aaataaagaa agaaaagaaa aaaatgaaaa tagtataatt agcagaagaa aacaccgtag 11640 

aatcctcgga ctcttaggat ggggaatgcc tataatataa aaaccctgaa gttataaaag 11700 

agaaaatcac ctacatacaa accaaatctt tctacatgcc taaaacatag cacaaacaca 11760 

gctaaataat catagctgaa tgaactggga aaacaaaact tgactcatat ccagacagag 11820 

ttaattttcc tacacataaa gagtacctat ataaacccaa caaaaaaacc accactaacc 11880 

caaaataaaa atgtgacagg taatgaacag gtagttcaca gagaatacaa atggctcttc 11940 

ggcacataag atgctcagac tgacttttac ttatttattt tttgagagac agggtctcac 12000 

gatgttgccc aggttaggct caaactcctg ggctcaaatg atagtaccag gactacaggt 12060 

gtgccccacc gcacctggct cctcaaccac ctgtattaac aggaaatgca aaataaaact 12120 

ttcaaatcta ttttacctat tagaatggca aaaatttgaa aaacttcaaa catcatcatg 12180 

ttggtgagaa tgtgaggaga ctggcactct cattttttgc tgatagcata tatatactga 12240 

tggcttctat ggaaagcaat ctggcagcgt ctatcaaatg tacaagtgca tatatccttt 12300 
gacaaagcaa ttccactcta ggaatgtgtt ctatatggtt gtgcttcctg gggctgggaa 12360

ctgggagcta agggacaggg gcagaagata accttctttt ccctccttcc ccgttaaaca 12420 

tgttgaattt tatacaccgt aatatattat ttttcacaaa agataatttt taagcgatat 12480 

gcctgggaat tttttttttt cttttctgag acagggtctc actctgtcat ccaggctgga 12540 

atgccatggt atgatctcag ctgactgcag cctcgacctc ctgggttcaa gcaatcctcc 12600 

cacctcagcc tcctgagtag ctgggactac aggcacgtgc catcatgcta atttttgtat 12660 

atacagggtc tcactatgtt gcccaggcta atgtcaaact cctaggctca agcaatccac 12720 

ccacctcagg ctccaaagtg ctgggattac aggcgtgagc caccgcgcct ggccctggga 12780 

atccttacaa aagaaaaaat atctactctc cccttctatt aaagtcaaaa cagagaagga 12840 

aattcaacct ataatgaaag tagagaaggg cctcaaccct gagcaacaaa cacaaaggct 12900

atttctgaga caggaatttg ctgaacaaaa tcgagggaag atgacaagaa tcaagactca 12960

cttctcggct gggcgcagtg gctcacacct gtaatcccag cactttggga ggccgaggcg 13020 

g.-icagatcac gaggtcagga gattgagacc atactggcta acacagtgaa acccagtctc 13080 

tactaaaaat acaaaaaatt agccgggcgt ggtggcaggt gcctgtagtc ccagctactt 13140 

ggj^agctga ggcaggagaa tggcgtgaac ccaggaagcg gagcttgcag tgagccgaga 13200 

tcacgccact gcactccagc ctgggtgaca gagcaagact ctgtctcaaa aaaaaaaaaa 13260 

ajg.ictcatt tctctagatc ttgagccgta ttcaaattta tctcagctta gtgagaggtt 13320 

aaa^caagga atatccttcc ctgtgggccc tgctccttac tgaaggaagg taacggatga 13380 

gtcaag-jaca ccaatggaga aaagcactaa caccattatc tgatgaacat tacgtgaaga 13440 

ag^cjta.tgaa gtgaagtgga attgctgaag aagtcagtga aagcggacat tcatttgggg 13500 

aaatg^aata taggaaatcc ataaaagtga ttaaaaagat gttagaggct gaggcggggg 13560 

gaccacaggg tcaggagatc gagaccatcc tggctaacac ggtgaaaccc catctctact 13620 

aaaaatacAa aaaattagcc aggcgtggtg gcaggcacct gtagtcccaa ctactcggga 13680 

gactgaggca ggagaatggc atgaacctgg gagacggagc ttgcagtgag ccgagatcac 13740

gccactqcac tccagcctgg gtgacagagt gagactccat ctcaaaaaaa aaagttagat 13800 

acgagujata aagatccaac agacacacaa ctgctaattc tgaacagaac aaaacaaatg 13860 

gcacaggaaa agaaaactta agatataaca ccggaaaact ttcctgaaat tgagtaactg 13920 

aatctatagc ttgaaagggt ttagcatatg ccaagaaaaa tcagtagagt ccaaccagca 13980

caagacacac ctagcaaggc tggtgattct accaacacag agaaagaagt gggtgaccca 14040

taatgcggaa aaaggcagac catctgcagt cttctccaga acactggagt ctgaagacaa 14100 

aagaatgctg cctactgagc cagaagggag agaaagtgac ccaacacatc tttaccaagt 14160 

tagaatgtca cgcattattt aaaggctgca aaagccatga aagacatgaa agaacacaag 14220 

catttacaac atgaaagaac acaagcattc tcatactcaa gaatccttaa gaaaaatgta 14280 

gtcctaatcc agcccactga aagttaaatg tacttaatgt gctcattaat gggaacttca 14340 

tagcttcaaa tcagtctggt cccatctacc aacatctctc gcccggcttt cctgcaatag 14400 

tcagcacctt tccctcctcc cagtcttgtc ccctggagtc tgctctcagc atagcagagt 14460 

gaccacatca acacccaagt cagagccctc cagtgcgcac tggtctacaa agcccttccc 14520 

accccccacc ccacgtgccc tccggatcct tgtgacgtgt ctcctgcata ccctagcagc 14580 

cctggcctcc tcactgcccc tcctgtacat caggaaggcg actccttgag tcttggctct 14640 

ggccgcctcc tccacctgca gtgagttaac tcccttacct actctaggtc attgctcaaa 14700 

tgtcagcatc tcaatggggc cctccctgac taccctattt aaattctaca tactcccctt 14760 

gaccccatgg acctcactca ccctattcca cttttattct tacaatttag cacttgttct 14820 

cttctaacgt attctaagac ttactcattt attacattgt ttgccacccc ctctagtaca 14880 

taaactccag aggggcaggg atttctgtct atttattcat ttctttatcc ctaggacata 14940 

gaacagggca tagttcagag tattcaatgt tatcaatgaa tgaactagca gtagtaccag 15000 

ttccagttag gcacagaatt aaatctaaat agaattaaat ctcatggtct gggttaacta 15060 

tggatagaaa attagatata attttaagaa gcctagaaag aaaaaattaa taatgtaaaa 15120 

ataatattaa tttgataata ataacaaaaa ctctgccagg cactgtggct caaatctgca 15180 

atcccagcta ctcaggaggc tgaggtggaa ggatcacttg agaccagagt tcaagactca 15240 

gcctaggcaa cacggcaaga aactgtctct aaaaaaatta aaacttaaat ttttaaaaaa 15300

gaattctcaa agcgtcacaa aaactggaga ttaaggtaca ggaagtgtga agtaatatta 15360 

ctatgctaat ggtttttttt ttttttagaa aggtataacc aaaagatttc tttctcaagt 15420 

cgataaactg agaaagataa gcatatcttc caattaacag agggggagga aaagccagat 15480

acaacaaaat aagatataaa ttagtttcca gttgaaaaca agagtaggag ttattttgca 15540 

tcacctcacc tgtgacctcc cccagcccaa aaaacactac tgataaacag ggtagaaaag 15600 

catcatctca gataaagcag gaaaaactgc cacagtctca aaccacaaac tataagcaca 15660 

cacctggcca accctgccaa gtctgggctc agtaggagga acgtgctgag agctaggatg 15720 

taccaactta gacattctgt gggatacaga tgtccctgga agggtcacac catctcaaag 15780 

gcacctgtaa tgcccactga ttacagccac catatgtgag agagaaactc agggcactta 15840 

gagagtataa caagaacctt atgtcatctg agatgaggaa tcctcagccc tgcaaattaa 15900 
aaccccagca ctttgggagg ccgaggcagg tggatcacaa gatcaggaga tcgagaccat 19560

cctggctaac acggtgaaac cccgtctcta ctaaaaacac aaaaaaaaat agcaaggcat 19620 

ggtggtgggc acctgtagtc ccagctactc gggagcctga ggcaggagaa tggcatgaac 19680 

ctgggaagag gagcagtgag ccgagatcgc accaccgcac tccagcctgg gcaacagagc 19740 

aagacttcgt cccaaaaaaa aaaaaaaaaa aaaaaaaagc ctcaacaaac aaccacaaac 19800 

gtgcttgaaa caaatgaaaa aaaaatcttg gcaaagaaat aaaagatata tattttggcc 19860 

aggtgcagtg gctcacagcc tgtaatccct gcactttggg aggctgaggc aggcggatca 19920 

cctgaggtca ggagtttgag accagcctga ccaacatgga gaaaccccgt ctctactaaa 19980 

aatacaaaat tagccagtca tggtggcaca tgcctgtaat cctagctact caggaggccg 20040

aggcaggaga atcgcttgaa ctcaggaggt ggaggttgcg gtgagccgag atcccgccat 20100

tgcacattgc actccagcct gggcaacaag agcaaaactc catctcaaaa aaatagatac 20160 

atattttaat ggaaatttta gaattgaaaa atacagtaac caaattgaat ggaaagacaa 20220

catagaatgg agggggcaga caaaataatc agtgaacttc aacagaaaat aatagaaatt 20280

acccaatatg aagaacagaa agaaaataga ctggccaaaa aataaagaag aaaaaagagg 20340

agcagcagga ggaatgatgg aaaaagagaa aggaaggaag gaagggaagg agggagggaa 20400

ggagtgaggg agaaagtctc aaagacctct gagactaaaa taaaagatct aacacttgtc 20460

atcagggtcc aggaaagaga caaagatggc acagctggaa acgtattcaa aaaataatag 20520 

ctgaaaactt cccaaatttg gcaagagaca taaacctata gattcgaaat gctgaacccc 20580 

aaataaaaag cccaataaaa tccacaccaa aatacatcat agtcaaactt ctgaaaagac 20640

gaaaagagaa aacgtcttga aagcagtgag tgaaacaaca cttcatgtat aagggaaaaa 20700

caattcaagt aacagatttc ttacagaaat taaggaagcc agaaggaaat gacacaatgg 20760

ttttcaagtg ctgaaagaaa agaagtgtca acacaaaatt ctagattcag taaaaatatc 20820 

cttcaagaat caatgggaaa tcaagacagt ctcagataaa gcaaaataag agaatatgtt 20880

gccagcagat ctcccctaaa ggaatggcaa aaggaagatc atgcaacaga ccaaaaaatg 20940

atgaaagaag gaatccagaa acatcaagaa gaaagaaata acatagtaag caaaaataca 21000 

tgtaattaca ataaaatttc tatctcctct taagacttct aaattatatt gatggttgaa 21060 

gcaaaaatta taaccctgtc tgaagtgctt ctactaaatg tatgcagaga attataaatg 21120 

gggaaagtat aggtttctat acctcattga agtggtaaaa tgacaacact gtgaaaagtt 21180 

acatacacac acacacgtaa gtatatataa atatatgtgt gtatatgtgt gtgtatatat 21240 

atatatacat ataatgtaat acagcaacca ctaacaacac tatacaaaga gataataacc 21300 

aaaaacaatt tagataaatt gaaatggaat tctaaaaaat attcaaatac tctacaggaa 21360 

gacaagacaa aaagagaaaa aaagaggagg acaaactaaa ttttttaaaa acataaataa 21420 

aatggtagac ttaagcccta acttatcaat aattacataa atgtaaatga tctaattata 21480

tcaattaaaa gacagagata gcagagttaa tttaaaaaca tagctataag aaacctgctt 21540 

tgggctgagt gcagtgactc acacttgtaa tcccagcact tcgggaggcc aaggcgggtg 21600 

gatcacctga ggtcaggagt tccagaccag cctggacaac atggtaatac cccatctcta 21660 

ctaaaaatac aaaaaaatta gccaggcatg gtggcacacg cctgtagtcc caactactca 21720 

ggaggctgcg acacaagaac tgcttgaacc cgggcagcag aggtagcagt gggccaagat 21780 

tgcgccactc cagcctgaac gacagagtga gactccacct cagttgaaaa acaaaaaaga 21840

aacctgcttt aaatatacca acatatgttg gttgaaatta aaagaataaa atatatcatg 21900 

aaaacattaa tcaaaagaaa ggagtggcta tattaataac ataaaataga cttcagagaa 21960 

aagaaaattt caagagacag gaataaaagg atcaagaaaa gatcctgaaa gaaaagcagg 22020

caaatcaatc attctgcttg gagattcaac accctctctt aacaactgat agaacaacta 22080 

gacaaaaaaa tcagcatgga gttgagaaga acttaacacc actgaacaac aggatctaat 22140 

agacatttac ggaacactct acccaacaat agcaaaataa acattctttt caagtattca 22200 

ctgaacatat ccttagaccc taccctgggc cataaaacaa agctcactag tgattgccga 22260 

aggcttggat ggacagtgga agagctgcat ggggagggag aaggtgacag ttaaagagtg 22320 

taggatttct ttttgggata atgaaaatgt tccaaaattg attgtggtga tgttggcgca 22380 

actctacaaa tataaaaaag gccattgaat tgtacgtttt aagtgggtga aacatatggt 22440 

atgtggatta tatctaacgc tttttaaaaa cttaacacat ttcaaagaat agaagtcata 22500

cagagtgtgc tctactggaa tcaaactaga aagaggtaac tggaggataa cgagaaaagc 22560

ctccaaatac ttgaaaactg gacagcacat ttctaaaatc atccgtgggt caaagatatt 22620 

catttctgat attcattttt attgtttaat gtatttttaa aaatttctta agggaaataa 22680 

actgactaaa aatgaatatg gctgggtgcg gtggctcacg cctgtgatcc cagcactttg 22740

ggaggccgag gctggtggat cacaagatca ggagttcgag accagcctgg ccaagatggt 22800

gaaaccccgt ctcaactaaa aaactacaaa aagtagccaa gcgcagtggc gggagcctgt 22860

ggtcccagct acttgggagg ctgaggtagg agaatcgctt gaacacaggc agcagaggtt 22920

gcagtgagcc aagattgtgc cactgcacgc cagcctgggc gacagagact gcctcaaaaa 22980

aaaaaaaaaa aaaaagaata tcaaaatttg tgggacatag ttaaagcaat gctgagaggg 23040 

aaatttataa cactaaatgt ttacattaga aaagagaaaa agtttcaaat caatagtctc 23100 
aatgaaatac

aaacgctgat 

gttggaaaac 

ttattactac 

gtatacggaa 

agtgggaaga 

gatattgaca 

acacacaaat 

atatgtctct 

gaacctcaca 

acatctataa 

ctttgggagg

acatggagaa 

ctgtaatccc 



agaagatgaa 

cagtaaaatt 

cgaaagatta 

cacggattac 

ttgacaatgt 

aatatgaaat 

taacactctt 

atacccttcc 

tgtgaaaaat 

ttttaactct 

atgtttaatg 

ggaacatctc 

gacagctact 

aaaaaaaaaa 

tggctcacac 

aggagtttga 

ttagccaggc 

ccaatatcct 

agtagaagca 

cgcaagtctg 

aaaactacat 

ttcatgatac 

ttacaaaaga 

ttcccctacg 

ccctgaagtt 

caaagaagaa 

aagcaactag 

taggaggagt 

attcttgaaa 

gtgaagtaat 

atgataaata 

gtggctcata 

cgggagtttg 

aattagcagg 

agaattgctt 

ctggtgacag 

tgcaaagcag 

caggcagatc 

tctactaaaa 

ggcaggagaa 

cctgggtgag 

aacactaaaa 

tacaaaaatc 

aatgcacaat 

accacatctc 

ggtactaaaa 

ttacatataa 

aaaagaaatc 

ttaatatagt 

caaaatccca 

atatgcaaag 

atcagtctat 

gagggacagc 

atgcccaaat 

cattaaaggg 

ccctacagaa 

aacatttaga 

ccaaggcagg 

gccccatctc 

agctacttgg 



gagcaaaata 
gaaaacagaa 
ataaaattga 
cagttattag 
agatgaaatg 
agataattgg 
aaaacagaaa 
ttaacaaata 
atcttcagaa 
caagaagcaa 
aagaattacc 
ccagttcatt 
cttgacacac 
gaagcactgg 
ctgtaatctc 
gactagcctg 
agggtggtgg 
tcatgagtat 
atatataaaa 
gttcaacatt 
aatcacatca 
tctaataaga 
ctacaaaagc 
atcaggaaca 
ctaacttgtg 
ataaaactgt 
gggtaggggg 
aagttcaaga 
atactaaaag 
gcatacgtta 
cacacaattt 
cctgtaatcc 
agaccagaat 
atgtggtggc 
gaacaaggga 
agtgagactc 
ccaagcgcag 
acaaggtcag 
aatatataaa 
tcacttgaac 
agaacgagac 
ctactaagtg 
aattgagctg 
gattgtgcct 
tatttaaaaa 
ttaaaaacat 
atctaacaaa 
aaagaagact 
aaagatgcca 
gaaaaatttt 
gaactagagt 
ccagtttcaa 
tatagatcaa 
gatttctgac 
tgtagagtca 
aaattaactc 
aaaaggccac 
tggatcacct 
tactaaaaat 
gaggctgagg 



aacccaaagc 

acacaataaa 

caaacctcta 

aatgaaagca 

gactaattac 

gatagcctga 

cattaaactt 

aaaacgacaa 

aaatagaact 

atatctgggc 

accaactcta 

ttatgaagtg 

tgcctatggg 

acaagggcag 

agcactttgg 

gccaacatgg 

ggaaaataaa 

agacacaaaa 

ataattatac 

tgaaaacaag 

atcaatgcag 

aaaataagaa 

ttacagctaa 

aagcaaggat 

caaaacgata 

tcctgtttgc 

gcagtggaga 

tacctattgc 

agtgggtgtt 

attagcacaa 

tatctgtcag 

cagcatttta 

ggtcaacata 

gtgcacctgt 

ggcagaggtt 

catctcaaaa 

tggctcatgc 

gagtttgaga

ttagccaggc 

ccggaggcag 

tccgtctcaa 

aattcagtaa 

gacaaaggag 

gtgaatagct 

aaaaaaaatt 

aataaatact 

atgtgcagga 

taaatagcgt 

attttatcca 

acatagatat 

agctaaaaca 

gacttacata 

tgcaaccaaa 

aaaggtgtta 

ttgcacatct 

aaaatgactc 

gcacggtggc 

aaggtcagga 

acaaaattag 

catgagaatc 



aagcaaaaga 
gaaaatcagt 
gcaaggctaa 
taattagaaa 
tgaaaaaaca 
taactactga 
aatattttat 
attattttgc 
ttgtttgaag 
ccagatggtt 
catagcatct 

ggtgttactc 

tagctctgct 
tataaaaaaa 
gaggctgacg 
taaaaccctg 
aaggaaaaaa 
ctccttaaac 
accatgatca 
gtaacccact 
aaaaaagcat 
taaaggggaa 
cctatactta 
gttcactctc 
agaaagggaa 
agatgacatg 
cacgctggtc 
acaacatggt 
aagcgttctc 
cgtatattac 
tttaaaaaca 
ggaggctgag 
gtgaaatccc 
agacccagct 
gcagtgagct 
aaaataaaat 
ctgtaatccc 
ccagcctgac 
atgtgtagtc 

aggttgcagt 

aaaaaaaaag 
gtctttagga 
gattgtttta 
gctgtgctcc 
gtatctctat 
gtttttaatt 
cttgtgtgct 
gaaatatacc 
aattattaca 
agacaagatc 
aatttgaaaa 
gctacagtaa 
tagagaacta 
aaacacttca 
ataggcaaaa 
aaggactaaa 
tcacgctcgt 
gtttgagacc 
ctggacgtgg 
gcttgaaccc 
ggggggcaga ggttgcggtg agccaagatc acaccattgc actccagcct gggcaacaag 26760

agcaaaactc caactcaaaa aaaaaaaaaa aaaggaaaaa tagaaaatct ttgggatgta 26820 

aggcgaggta aagaattctt acacttgatg ccaaactaag atctataagg ccagtcgtgg 26880 

tggctcatgc ctgtaattcc agcactttgg tcaactagat gaaaggtata tgggaattca 26940

ctgtattatt ctttcaactt ttctgtaggt ttgacatttt tttagtaaaa aattggggga 27000 

aagacctgac gcagtggctc acacctgtaa tcccagcact ttgggaggcc ggggcaggtg 27060 

gatcacacgg tcaggagttc gagaccagcc tggccaacat ggtgaaaccc cgtctctacc 27120 

aaaaatataa aaaattagcc gggtgtcatg gtgcacgcct gtaatcccag ctactgagga 27180 

ggctgaggca ggagaatcac ttgaacctgg gaggtggaag ttgcagtgag ccgagattgt 27240 

gccactgcac tccagccttg ggtgacagag cgagactccg tctcaaaaga aaaaaaaaaa 27300 

aaagaatatc aaacgcttac tttagaaact atttaaagga gccagaattt aattgtatta 27360 

gtatttagag caatttttat gctccatggc attgttaaat agagcaacca gctaacaatt 27420 

agtggagttc aacagctgtt aaatttgcta actgtttagg aagagagccc tatcaatatc 27480 

actgtcattt gaggctgaca ataagcacac ccaaagctgt acctccttga ggagcaacat 27540 

aaggggttta accctgttag ggtgttaatg gtttggatat ggtttgtttg gccccaccga 27600 

gtctcatgtt gaaatttgtt ccccagtact ggaggtgggg ccttattgga aggtgtctga 27660 

gtcatggggg tggcatatcc ctcctgaatg gtttggtgcc attcttgcag gaatgagtga 27720 

gttcttactc ttagttccca caacaactgg ttattaaaaa cagcctggca ctttccccca 27780 

tctctcgctt cctctctcac catgtgatct cactggttcc ccttcccttt atgcaatgag 27840 

tggaagcagc ctgaagccct cgccagaagc agatagtgat gccatgcttc ttgtacagcc 27900

tacaaaacca tgagcccaat aaaccttttt tctttataaa ttatccagcc tcaggtattc 27960 

cttcatagca agacaaatga accaagacag ggggaaatca acttcattaa aataatctat 28020 

gcagtcacta aacaaataag aacaagaggc tccagaagtg ggaagccaat acccagagtt 28080 

cctacaacac agtatctgaa aagtccagtt tccaaccaaa aaatatatat atacaggccg 28140 

gacatggtag cttatgtctg taatcccagc actttgggat gctgaggcgg gcagatcacc 28200 

ctaggtcagg agttcgagac cagcctggcc aatatggcaa aaccccgtct ctactaaaaa 28260 

cacaaaaatt agccaggcat ggtggtggat gcctgtaatc ccagctactc gggaggctga 28320 

ggcagggaat cacttgaacc caggaggcag aggttgcagt gagccgagat cacgccactg 28380

aactccagcc tgggcaacaa agtgagactc cacctcaaaa aaaaaaaaaa tatacatata 28440 

tatatgtgtg tgtgtgtgtg tgcgcgcgtg tgtgtatata cacatacaca tatatacata 28500 

tatacagaca cacatatata tatgaagcat gaaaagaaac aaggaagtat gaaccatact 28560

ttctgtggtt atgataggat ggggtatcac gggggaagta gacaagggaa actgcaagtg 28620 

agagcaaaca gttatcagat ttaacagaaa aagactttgg agtaaccatt ataaatatgt 28680 

ccacagaatt aaagaaaagc gtgattaaaa aaggaaagga aagtatcata acaatattac 28740

tccaaataga gaatatcaat aaaggcatag aaattataaa atataataca atggaaattc 28800 

cggagttgaa aggtagaata actaaaattt aaaattcact agagaaggtt caacactata 28860 

tttgaactgg cagaagaaaa atttagtgag acaaatatac ttcaatagac attattcaaa 28920 

tcj*iaaaataa aaagaaaaaa gaatgaagaa aaataaacag aatctcagca aaatgtggca 28980

caccattaat cacattaaca tatgcatact gagagtaccg gaagcagatg agaaagagga 29040 

agaaaaaata ttcaaatgat ggccagtaac ttcctagatt tttgttttaa agcaataacc 29100

tatacaatca agaaactcaa tgaattccaa gtaggataaa tacaaaaaga accacaaaca 29160 

gatacaccat ggtaaaaatg ctgtaagtca aaaacagaga aaatattgaa agcagctaga 29220 

ggaaaactta taagagaacc tcacttacaa aagaacatca cttataaaag aaccacaata 29280 

atagaaacag ttgacctctc atcagaaaca atgaatgata acatatttga agtgctcaaa 29340 

gaaaaaaaat aaagattcct atatacgaca aagctgtctt tcaaaaatat acatccaaaa 29400

ggattgaaac cagggtcttg aagagttatt tgtacatcca tgttcatagc agcattattc 29460

acaatagcca aaaggtagaa gcaacccaag ggtccatcga caaataaata aaatgtggta 29520 

tatgtataca caatggaatt tattcagtat taaaaaggaa tgaaattctg acacatgcta 29580 

caacatggct aaaccttgag aacactatgc taagtgaaat aagccagcca caaaaggaca 29640

aataccatat tacttcactt gtatgaaata cctagggtag tcaaattcag agatagaaag 29700 

taaaacagtg gttgccaagg gctgagggag ggagtaacgt ggagttattg ttgaatgggt 29760 

acagaatttc agttttgcaa gataaaaaga gttctggaga cagatggtgg tgagggtggt 29820

acaacaatac aaatatactt tatactactg aacagtatac ttaaaaatga ttaacatggt 29880 

gaaaccccgt ctctactaaa aatacaaaaa aattagctgg gtgtggtggc gggcacctgt 29940 

aatcccagct acttgggagg ctgaggcagc agaattgctt gaaaccagaa ggcggaggtt 30000 

gcagtgagct gagattgcgc caccgcactc tagcctgggc aataagagca aaactccgtc 30060 

tcaaaaaata aaaaataaaa aaaatttaaa aatgattaag caggaggcca ggcacggtgg 30120 

ctcacaccta taatgccagc actttgggag gccgaggcag gcgatcactt gagaccagga 30180 

gtttgagacc agcctggcca acatggcaaa accctgtctc tgctaaaaat acaaaaatta 30240 

gccaggcatg gtggcatata cttataatcc cagctactgg tgagactgag acacgagaat 30300 
tgcttgaacc
tgggcgacag 
aaaaataatt 
ggctcatgag 
atgtatctta 
cccagagaaa 
aggctgaaaa 
cagtaaaggt 
aaatgattta 
acatatatag 
caaagctgtt 
taaaaataaa 
gtaatattaa 
ataggaataa 
taagttaaga 
aaaaaataaa 
taccacaaat 
aacacctgca 
cctcccttgt 
tagctatatc 
atgaaatcaa 
aagatcaacc 
ttactagaat 
cgaacaaatg 
acatcaacta 
gaagagactg 
ttcactgtga 
ctccaaaaaa 
gaactactaa 
ccaatataca 
ttaagaaaat 
ttattcaagt 
aaaggtttac 
aatgctctcc 
tgggggtggc 
cc tgaggt eg 
aatacaaaat 
aggcaggaga 
tgcactccag 
ccagatgact 
accttgagaa 
gecaattgea 
acacatacat 
tctatggtaa 
ctcaccatgt 
cctcccaaag 
aaaagtcaac 
aaaaaaaaaa 
caccacatcc 
aggtctggga 
ecatatgeaa 
aaaactctta 
cttagacatg 
aaaatgaaaa 
aaggcaatat 
acagtaacca 
ataaagaacc 
ccttcaataa 
cctatctctc 
ccaaaactat 



caggaggcag 
agcaagattc 
aagcaggaaa 
agcacaaaac 
ccacaaaaaa 
caaaagtaga 
caagtgaccc 
tattctgtaa 
aaaagcaatt 
aaatatactt 
actaggctaa 
attttaaaaa 
tagacataat 
cattttggta 
cacacatgtt 
taaaataatt 
tgagtggctt 
atcaaggtga 
ctcttccagc 
attcctagca 
aaggagaaaa 
aagttaacaa 
cagaaataaa 
tgtgccaaca 
ccaaaattta 
aattgacaac 
aattctttca 
aagaacagat 
gacactatga 
aaaatctatt 
aattcaattt 
aagtgcaaaa 
ataaatgaaa 
aaattgatct 
ggttcatgcc 
ggagctcgag 
tagtcaggcg 
ategcttgaa 
cctgggcaac 
tcactgttga 
tagecaaaac 
aatgttacga 
acatacatat 
gtgettttet 
tgcccaggct 
tgctgggata 
aagaccattc 
tgaagttgga 
ageccaaatg 
taatcagata 
aaattaattc 
gaaggaaaca 
acaccaaaag 
acctttgtgc 
taagcaaaaa 
aaacagcatg 
caaaaataaa 
atgatactag 
accatataga 
aaaactactg 



agattgeagt 
tgtctcgaaa 
egagattget 
ttttcaaaaa 
aagggctggg 
gaatttgttg 
cagagggtaa 
ctatgacact 
gcataaaata 
gtaatatatt 
agaaattact 
atttaaaaat 
acaaaaatac 
tctaactaga 
aaaccctaga 
aaaatgtttg 
aacacaactt 
gtacagggee 
ttccagtggt 
accagaaaga 
atggaaaaaa 
accttttaac 
agaggggaca 
aattagaaaa 
ctcaagaaga 
caagaaacta 
aacttataaa 
ctctatttac 
taactgataa 
atatttctat 
acaataacat 
cttatactct 
aactatccca 
ataaattcaa 
tgtaatccca 
ateagectga 
tggtggcaca 
cccaggaggc 
aagagcaaaa 
aattgaaaag 
aaacttgaaa 
cacagcaaca 
caatggaata 
atttttttct 
ggtcttcaac 
actggcatga 
ttttcaacaa 
ccctccatca 
attttcaaaa 
gtcacatgaa 
aaaaatgaat 
tacgggtaaa 
catgaccaac 
tggaaaggac 
gaacaaagct 
gtactagtag 
tccacatatt 
gaaaactgga 
aaaatcaact 
gtagaaaaca 




gagtcgagat 
aaacaaaaac 
gctgaggagg 
atgtttaatg 
gggcaggaaa 
ccttagaaga 
tctgaattct 
aacaatgeat 
ttatatataa 
tgcaaataac 
acagatagta 
aataattaca 
cacaaaaagg 
attaaattat 
tactaaaaag 
tattagtttc 
aaatgtattt 
atgctccctg 
tctcagtaac 
agaaaataat 
ataaataaaa 
tagactgaca 
ttactaatga 
cttagatgaa 
aagagacaat 
tccacaaaga 
tataaattaa 
aggegatacg 
acaagttcag 
acacttgeag 
caaaaagaat 
agaagctaca 
tgttcatgga 
caaaatcctt 
gcactttggg 
ccaacatgga 
tgcctataat 
agaggttgea 
ttccatctca 
attattctaa 
aacacgaaca 
gtaatcaaga 
taattgagag 
tttttttttt 
ttctgggctc 
gccaccacat 
ataggtctgg 
cactaaagtg 
aggtcaacaa 
aaaaaaaatg 
tgatgactta 
tcttaaagac 
taaggtaaaa 
accatcaaga 
ggaggcatca 
aaaaacagac 
tatagtcaac 
tatcgatatg 
cagactgaat 
taaggaaaaa 



cgcgccactg 
aaaaacaaaa 
agaaagatgt 
attaaaatgg 
tgaaggtgaa 
aacaccacag 
cacagaaaat 
attttttcct 
agectattgt 
tgcacaaaag 
aagtaatata 
acaataatat 
gaagaagaca 
aaatatgaag 
taactcacat 
ctcagggtac 
tctcccagtt 
tgaaggctct 
ectaagtget 
aaagattatg 
ccaaaagcta 
aaaaggaggt 
gggattagaa 
atggacaggt 
ttgaatgagc 
aaatcccagg 
catcagttct 
atctttagaa 
caaggctgea 
tgaacaaccc 
aaaaacactc 
aaacactgtt 
tcaaaagact 
atcaaaatcc 
aggctgaggc 
gaaaccctat 
cccagctact 
gtgagccaag 
aaaaaaaaaa 
aattcacatg 
aaatatagga 
ctgtgtggta 
tacagaaaca 
cttttttgta 
aagcaatcct 
ccagcccaga 
gatgatcaga 
ctgegattat 
gaccattctt 
aagttggacc 
aacgtaagag 
gttaggtttg 
tagggtaaat 
aatggaaagc 
tactacctga 
acatagacca 
tgatttttga 
cagaagaata 
taaagacttg 
cgcttcagga 



aattccagcc 
agcaaaacca 
gcaggaccaa 
taaattttat 
ataaagacat 
gaagttcttc 
tgaagcatag 
ttcttctctg 
tgaacctata 
agagttggaa 
acagggaact 
ggttgggttt 
atagaactac 
tatattctgg 
aaatacagta 
agtaacaaac 
ctggaggcta 
aggaaagaat 
ccttggcttg 
gcaaaaaata 
gttctttgaa 
aagactcaaa 
aagaatacta 
tcctaggaca 
tataacaagg 
cccagaagat 
tcacaaactc 
aatcctaagg 
ggatagaaaa 
aaaaatgaga 
aaaaataaat 
aaaagaaatt 
tattactggc 
cagatgaggc 
aegcagatta 
ctcttctaaa 
egggaagctg 
ategtgecat 
aaaaaaaatc 
gaattgcaag 
tgactcactt 
ctggcaaaag 
agectaaaca 
gagatagaat 
cccactgtgg 
tgattttcaa 
tagtcacatg 
aggcatcagc 
ttcaacaaat 
ctccatcaca 
ttacgactgt 
acaaagaatt 
tgtacctacc 
caaaatagee 
cttcaaagca 
atggaacaga 
caatgacacc 
aaactagacc 
aatgtaagac 
cattggtcca 
32700
32760 
32820 
32880 
32940 
33000 
33060 
33240
33420
33540
33600 
33660 
33720 
33780 
33840 
33900 
ggcaaagatc
tagcacttta 
aacctgtaga 
gaacacacaa 
gggcaaaaga 
ccagtaccac 
gtcatcctac 
agtataaaat 
aaaatcaaga 
acaaaatata 
tagggatgca 
gaaaatagaa 
aagtcagtat 
gccaagacag 
catatacact 
gcaacatgga 
aaacagtaca 
gaaggggagg 
ttttagtgtt 
caaaaagcta 
gatggatatc 
cactctgtac 
aattaagaca 
aatatttaat 
cccaattcaa 
ggccaataaa 
aaccacaacg 
taataacaaa 
atgtaaagtg 
agagttaccg 
tatatccaca 
gtggtaagaa 
atacactaga 
gaatctcgga 
ttccatgcat 
gttgggtggg 
accacaaact 
ctctgagatc 
tcccaggcct 
ttttctctgt 
gctatgcaca 
aacttatagt 
tttttatcta 
tagcaggaat 
gggatggtga 
aagctggaag 
ggctggagct 
ctgacttccc 
tggccagaga 
gagacagaga 
gtactatctt 
aattgagaga 
gagaagagaa 
gaatgaaaag 
cacgctgtac 
agtcaaattc 
attaaataaa 
tagcaaacga 
taatattctg 
ttggtctata 



ttatggctaa 

ttaaactaaa 

atgggagaaa 

gtgactaaaa 

tctgaataaa 

actgtcttga 

actttgttct 

agccaacaag 

ccactatgag 

atggatgctg 

aattggtaat 

ctaccatatg 

actgaagaaa 

ggaataaatc 

caatagaata 

tgaacctgga 

tgttctcact 

cttgggaaaa 

ctatagaact 

gaagagattt 

ctaattaccc 

ctcataaata 

acccacataa 

atttataata 

aaatgggtaa 

gacacgaaaa 

tagaatgtag 

tgttggtaag 

atgcagccac 

tatgacccag 

taaaaacttg 

cccatatgcc 

atattatctg 

aaccttatgc 

cggaaatgac 

gctgggagga 

ggggagctta 

aaggtgtcag 

ctctccttgg 

gtgtgcccat 

aagtgaagtc 

cattttaatg 

cattgtgcaa 

ttaatatggg 

gccaacccag 

gataaaggag 

gggaccaccc 

ccttcctccc 

cagtgacaag 

aatatggaag 

atttatcttt 

aactgaaaac 

aaatttattc 

ctataatcag 

aacctgaagg 

taagtgcttt 

atggaaactt 

ttctggaatt 

ctgacctcct 

gtttacatct 



aacctcaaaa 

aagctcctgc 

atatttgcaa 

caactcaaca 

cattctcaaa 

ttacttgtta 

tgtttttcaa 

tatgaaaaaa 

atatcctctc 

gcaaagattt 

ggccattatg 

atccagcaac 

tatatgcact 

taaatgtgca 

ctattcagcc 

ggacattata 

cagacatggg 

gttaatggat 

gtagggcgag 

tggatgttcc 

tgattcaatc 

tgtataatta 

tggaagaaat 

tataaagaac 

aagccttgaa 

gatgctcaac 

acaccacttc 

gatgtgaaaa 

tttggaaaac 

gaatattcct 

tacatgggca 

catcatctga 

cccatacaag 

taagtgaaag 

cagaataggg 

caggtagtac 

aacatagaaa 

cagagctggt 

ctggcaggtg 

gtccaaattt 

tacttccaaa 

tccgcttttc 

agtttaataa 

aactaattac 

agattagcaa 

gggctattat 

tagagacact 

acctttcaat 

gaacactgca 

ggtagaaaat 

gtatctccag 

tccaattgaa 

cgcatagagt 

caaagatttg 

cacaatgcat 

tccagaatct 

actaaacttt 

cctagagtaa 

tttgctattt 

acgggcttat 



acacaggcaa 

acagcaaagg 

actatccatc 

gcaaaaaagc 

ggaagacata 

gtgtataaat 

gtttgttttg 

tgctcaccat 

actccagtta 

ggagaaaggg 

gaaaataata 

cctactactg 

ctcatgttaa 

tcaacagatg 

attaaagaag 

tttaatgaaa 

tgctaaaaag 

aaaaatttac 

tatagttacc 

cagcacaaag 

attacacatt 

ttacgtcaac 

aaaatatctg 

tcctacaact 

tatacactta 

atcactagtc 

atatgcacta 

aatcagaaac 

agtctggcag 

cctgggtcta 

tttatagcaa 

tgaacaggta 

gagtgacatc 

aagccagtca 

aaatctatag 

actactttcc 

ttgatttcct 

tctttctgag 

gccatcttct 

tgattggctc 

agaagggaag 

ctatgagatt 

gaaaaataga 

aaggtttagg 

cagtgggacc 

cagagtccac 

gtgcaaagca 

ctcccactag 

aaatgaagtt 

gaatcagagg 

tgcctaatct 

atgaaagaat 

aaacaagaat 

ccagagaaat 

gaaaacgttt 

ctcaagacga 

ccccttgtat 

aatatatttc 

aggatatttg 

actgttcttt 



caaaaacaaa 

aaacaacaga 

catcaaggga 

aaataatctg 

caaatgtcac 

ttttaaattg 

gctattctgg 

cactaatcat 

gaatggctac 

gaactcctat 

ctgaggtttt 

ggtatttatc 

ttgcaacact 

aatggataaa 

aatgaaatcc 

taagtaaagc 

aaaatggggt 

agctatgtaa 

aataacttat 

gaatgataaa 

gcatacatgt 

aaaaaaagga 

caaattatat 

caagaacaac 

tctaaagact 

atcagggaaa 

ggatggctag 

ctcattcgct 

ctcctcaaat 

taaccaaaaa 

cattattcat 

aataacatgc 

cagctacatg 

caaatgacca 

agacagaaag 

cagaactact 

cacagttctg 

ggccctgagg 

ccctgcgtct 

attctgggtc 

agggaacact 

gtgaacacac 

attcaagaga 

gcaggactaa 

ccatctacct 

aagccagtgt 

gaaaacaagg 

tgcttcctac 

tgtaggaatc 

ataaagagaa 

gtctctcaaa 

ggagaattac 

ggattcacaa 

taaaaagtgg 

caagaaatga 

ttatatagct 

taaactaaca 

gtcaaagtgt 

tatacacatc 

ttttcatttt 



aatggaaaaa 

atgaaaagac 

ctagtatcca 

gtttttatat 

tatcattctg 

ggaagtgtga 

gagccttgca 

cagagaaata 

tatcaaaaag 

acactgtggg 

tcaaaaaact 

caaaggaaag 

gttcacaaca 

gaaaatgtgg 

tgtcatccca

acaaaaagat 

cacagaatta 

gaagaataag 

tgtacatgtt 

tgtttgtgat 

atcaaattat 

aaaaaaagaa 

atatctgata 

aacaaaacaa 

atatacaatt 

tataaatcaa 

aataaaaagg 

gctgttggga 

tattaaatac 

aatgaaaaca 

aacagcaaag 

ggtattatcc 

ctacaaggat 

cagattatga 

tagattagtg 

ggaacaaagt 

gagactagga 

caaggctctg 

tcacatcatc 

atggccaatt 

gactaggcta 

agaagtaggg 

agcagttcaa 

aaagccagtt 

accacccatg 

cagagtcctt 

gggaaaaacc 

tagccatact 

atctccctct 

aaaaccctga 

aaaggaaagc 

tggactagaa 

aggacgtgat 

taaactcagc 

caagatttga 

accccatttt 

tatgtcctaa 

attgctcttt 

acacgtaaat 

tttaaaattt 



33960 
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34080 
34140 
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34260 
34320 
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34440 
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34560 
34620 
34680 
34740 
34800 
34860 
34920 
34980 
35040 
35100 
35160 
35220 
35280 
35340 
35400 
35460 
35520 
35580 
35640 
35700 
35760 
35820 
35880 
35940 
36000 
36060 
36120 
36180 
36240 
36300 
36360 
36420 
36480 
36540 
36600 
36660 
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36840 
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36960 
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37200 
37260 
37320 
37380 
37440 
37500 









ccaaccccca 

ctgagatgct 

taccaccatc 

agagtctttc 

ctctcacccc 

ccctcttccc 

gctcccaacc 

agcaactgcc 

ttgtgtcttg 

ttcattagca 

ttaagtctta 

cagaccaaat 

ggagcagagc 

ctatagctct 

atttcactaa 

cgatccagaa 

gtcagtattc 

tactgtgttc 

catcatcacc 

taaagtcaaa 

gcagcccagg 

atacttttca 

tactaaatgt 

tgactattca 

gacgtataac 

tctaaaatag 

ttatgtcaat 

agtaaacaga 

ctgtaatccc 

ccatcctggc 

gtagtggtgg 

acccaggagg 

ctgagtgaaa 

gtaaataaat 

tgcaaagacc 

tttgggaggc 

cagtgaaacc 

tattatatat 

gtgcagccac 

cacgaccagg 

aggagacatg 

tctacaggag 

agagatgggt 

tcatgtaaat 

ggtttgaact 

acagcaagac 

aaggatgaag 

ttccttatga 

catcacacgc 

cagtaggctg 

gcaggcaatc 

gtattatatg 

aaacggaatg 

catgggccag 

ctgaggtcta 

gcactttggg 

aacacagtga 

gcctgtagtt 

ggttgcaggg 

gtctcaaaaa 



gtatccatat 

ttccatgttt 

aaccttggat 

ttgtcattcc 

atggaatttg 

ccttcattta 

cctgctgccc 

tgctccctca 

tccatcacta 

aaatgttatg 

agactatggt 

gaagagacca 

taagtagttc 

tcacagaaaa 

accatagttt 

aaggttgaaa 

tgctgccatg 

tcaatgccga 

atgcttgttt 

tgacactagt 

ccctagcaac 

ccctttcaag 

tgtcaacaga 

ggttatagaa 

atattaggag 

tctatattgg 

ggaaactcaa 

cagatgcaaa 

agcactttgg 

taacatggtg 

gcaccagtag 

cggagattgc 

ctccatctca 

aaaaagagag 

agggccttgg 

caaggcgggc 

cggtcnctac 

atatatatca 

ccttgacagc 

cagtcctact 

tacgaggctc 

acgagatgga 

ggtgagtgac 

taaaaataga 

gcctgtgtcc 

caacccctct 

acttttatga 

ttttctttat 

aaaataaatg 

tcagtagtta 

agttcccctg 

aacctcatta 

aacaaataaa 

tgatgataaa 

aaaccaagga 

aggctgaggc 

aagcccatct 

agctactctg 

agccgagatc 

aataaaaaaa 



actgctctct 

ttttttttta 

tatttaagca 

tgctatcagc 

cagatgaagt 

gacatcacct 

aattgtgtgc 

tctgtctccc 

taatctcagc 

tataaccttg 

ttagaacatg 

tgttcattta 

caagggaaca 

agttttcaga 

tttgggtttg 

agaatgaatc 

ctgacaccca 

gtccacccac 

atccttaagg 

ggccaggagg 

agcaggagct 

agagactagg 

catgtcaaaa 

ttaaggattc 

aaactatgtg 

attccagttg 

aaagataaca 

taaaaagagg 

gaggccgagg 

aaaccccgtc 

tcccagctac 

agtgagccga 

aaaaatataa 

agactgctaa 

gatggccggg 

ggatcatgag

taaaagtaca 

gagccttggg 

aatctggcag 

cctgggtcta 

attcagcatt 

caaaatgtgg 

aatcctaaga 

tgcacacaaa 

acttacatgt 

tcttcctcct 

taatccaatt 

ctctagctta 

ttaattgact 

agttttggga 

accccctcat 

gaatagctgt 

ccaacaaatg 

gggctaagaa 

aagggagggc 

gggcggatca

ctacaaaaaa 

gaggctgagg 

acaccattgc 

ataaaaaaac 



atcagggtta ttttaacttt 
ttttctgcca catttgaata
ttcacgattc cacgtgtgga 
acagaaccca atctcagctt 
tcaaaaggac ctttgcatta
tcttctagaa cgtcttacct 
tctcccgtgt cctggcctgc 
cacccagaca ttaagctgaa 
acctagtacc tagtaggtac 
caccttaaaa acaagagaag 
gatcagaaac tacagtctgc 
catacaacct atagcagctt 
cacggccctg caaagcctaa
tccctcgttt agaactcttg 
tttggttttt tttggcaaaa 
attactgctg aaagaatgtg
tccaatagtg tcatgagatg 
tccataacca tgtccaagca 
tattgcctca catacagcag
tcaagagaat gagtgaggac 
cacccctcag tcactctagc 
aatctggatt tttatgtgaa 
ggtaaaacta agtaagttca 
ttatccaaca cagataccaa 
cactgtcgaa acatcaacaa 
aaacatgggg aaaggacatg 
agcatatata aaagcattct 
gaaactgctg ccgggcacag 
cgggcggatc atgaagtcag
tctactgaaa acacaaaaaa 
tcaggaggtt gaggcaggag 
gaccatgcca ctgcactcca 
taataattat aattataata 
agtctagaaa gttgaatgat 
tgcagtggct cacgcctgta 
gtcaagagat caagaccatc 
aaaaaatata tatatatata 



aatccttgtg 
tacttggtta 
aatcccaaag 
actgggagtg 
tggatattaa 
tacagaataa 
gcagtatacg 
ggattttctt 
ccccctcagc 
ccaaggaact 
cattattcta 
gtttatatta 
gtcaaaagtt 
tgttcacggg 
ctatagggag 
cattaacaag 
tgagaatata 
caggcgtgga 
caagattagg 
tacaagaatt 
caggagaatc 
actccagcct 
agagaaaggg 



tgctgctggg 
tattaagtat 
aattctcaca 
ggaatcaacc 
gaccagaatc 
aggctagaac 
cgtgaccctt 
ccacttctgc 
ctactcaaca 
aatgaaaagt 
agaatatggt 
tgggtaaggc 
atacacagat 
tcaactgtat 
aagagaatga 
caaaacaaca 
attaattcaa 
ggctcacgcc 
agtttgagat 
acccaggtgt 
acttgaaccc 
gggtgacaga 
aggaaactag 



gtaaaatcag 

gcataggagt 

ttttttattc 

tccagctata 

tcctgcctcg 

gacatgccct 

catcctcttt 

tagactggat 

ttaccatgta 

gaagacaaaa 

agcccaaatc 

tcacactaca 

aatatttact 

ttcatatgca

aggaatgagc 

cacacagtcc 

cagcagctac 

atcttgggaa 

tggctggtca 

aggtgggtag 

caggactgaa 

atatcttgat 

tggggcagat

ccaaaaagct 

ggggctaatg 

aacaggcaac 

caaattcagt 

tggctcacac 

gagatcgaga 

ttagccaggc

aatggcatga 

gcctgggcga

ataataaata 

gccaagcgca 

atcccaccac 

ctggccgaca 

tatattatta 

gaaggtagtg 

aggcacacac 

caagtccata 

tgggtgtcca 

accaagtaac 

atgatgccat

gaatagcaca 

tacccccaag 

tgaagatgac 

atattttctc 

acataataca 

ttccactcaa 

tttcaactgt 

atacacaaaa 

gagtgggata 

gaggggcttg 

ttcctcacac 

tgtaatccca 

cagcctggcc 

ggtggcacat 

aggaggcgga

gtaagactct 

atccaggctg 



37560 

37620 

37680 

37740 

37800 

37860 

37920 

37980 

38040 

38100 

38160 

38220 

38280 

38340 

38400 

38460 

38520 

38580 

38640 

38700 

38760 

38820 

38880 

38940 

39000 

39060 

39120 

39180 

39240 

39300 

39360 

39420 

39480 

39540 

39600 

39660 

39720 

39780 

39840 

39900 

39960 

40020 

40080 

40140 

40200 

40260 

40320 

40380 

40440 

40500 

40560 

40620 

40680 

40740 

40800 

40860 

40920 

40980 

41040 
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actagataca 
tgaatgaaat 
gaactcagac 
cagaataaaa 
cttccttcta 
tgcagggaga 
tcttgtggaa 
aagccagtga 
gggaagaaag 
gaagtggcag 
gaccatgtgt 
caatctcaac 
agttcaaaag 
ctctctcact 
catcaggaaa 
aggcaaaaaa 
tttttctgag 
actgcttttt 
aataaatgaa 
gagatcctcc 
ggcccaagac 
cctgggaact 
tgcctaaatt 
tggtatgtgc 
gggtctctag 
tcggggagtt 
gtgcctggct 
atcctttcac 
accaccagat 
aaggactggg 
ggaccagtta 
aacctgggag 
cttccaagtg 
agaaacctgg 
gtaaaaagta 
gaggctgtta 
tagaacagca 
caggactgtt 
gctcttcaac 
aaacttcgaa 
gacactgaat 
attattagat 
tgacgatctt 
atgctactat 
gcatgaagaa 
gcgagagaag 
tcttggtctg 
aggcggtgtg 
ctggtgggtt 
tacagctctt 
gctagcttca 
tgcagaccca 
tcagcagcgc 
attctcttat 
gctccatttt 
cagagtgctg 
tttacaatcc 
ctggcttcac 
gccctgcgcc 
ggggtggtgc 



gcctttagag 
tgaaaagcct 
aactcaaaca 
atcagctgca 
gtggttcttt 
catggggtat 
gattatacac 
caaagaagcc 
accaacatgg 
atctctgagc 
ggatttttta 
tttccagcta 
gatccttgcc 
gacaccctct 
tgttctccac 
attcatgacc 
gaacacagga 
aaaataaata 
tgatagggtc 
caccttggtc 
tgttattctt 
tagatttcag 
attgtgtggt 
tgggcaaagg 
tgagtttccc 
aagcacatac 
tcctttggac 
tgtaataaat 
ctgagcatgg 
catggtggcc 
agcccaaaag 
acagagcaag 
tcttaaaatt 
aaaacaagag 
ctcagaacca 
tgcagaagga 
gaggtagaac 
ggacccttcc 
aatcacagga 
gatgaacaca 
caacagaaca 
attcttggga 
tgtgatatac 
ttgtgcaaaa 
actttggaag 
taagaggaca 
acttggagaa 
tctggagttt 
cgtagtctcg 
aagggggcgc 
ggagtgaagc 
aagagtgagc 
ggaatgcgac 
ctggccacac 
acagagaacc 
attggtgcgt 
cttagctaga 
ccagtggatc 
cgcactcctc 
tgtcagggag 



ttagaaaaga 
ttcaaactaa 



ggtaatgtca 
tgtgaagcag 
ccgaaaacat 
ataactatga 
aatgaggcaa 
agtgatgaaa 
atgggggtga 
tggatgatgg 
ttcagctctt 
tattgagcta 
ttttcaaaat 
caaggctgct 
tcagtttcac 
atctgactgg 
ggaaaatctt 
aataaataaa 
ttctgtattg 
tcccacagtg 
aaaaagtctc 
aagggttccc 
ttatgctgaa 
gggcctgcat 
tggtagacag 
atcctgtgtg 
ttggccccat 
tacagccgtg 
tcctgggggc 
catgccggta 
ttcaaagtta 
accctgtccc 
caatggaatg 
tgccgatggc 
gattacctga 
aatggtaaca 
ctgacaaggt 
cctcacatgg 
ggcacgctac 
taaagaatca 
caaacccaag 
agacctaagg 
caagaaataa 
aaggagaaat 
gtacataagt 
ggaatggtgg 
tgaagccgtg 
gttccttctg 
ctgactcagg 
atctagagtt 
tgcagacctt 
agtaataaga 
cgcagcacgt 
ccatatcctg 
gattggtcca 
ttacaatccc 
cataaaggtt 
cggcatcagt 
agccctctgg 
gctcgggccg 



tgatttgaca 
aacatttaat 
gcgtggtgtt 
tgactagaat 
taataggcac 
cttactgttc 
caaaaactat 
ggccctgtga 
tcagggtggc 
gccactacca 
tcgtgtcatt 
aacttctcac 
aattttgaat 
gagcacgtgc 
cttaatacaa 
gagaagtcat 
acagaaaaga 
taaataaata 
gccaggctag 
ttgggattat 
ataaaaagca 
accatccaac 
ctcctgcttt 
gaccagcccc 
catttcacat 
actgcactgg 
gcacctttcc 
agtacaccac 
ccccaacaca 
atctcagcgc 
cagtgaccta 
caaaacaata 
gtagaaacat 
caactaaaat 
gcaaaccata 
ggtttccagg 
gattacctgg 
aatacacacg 
gcctagtaag 
ccaagttttt 
caaagataat 
ggacattata 
aaacacagga 
ggagaatctg 
aactaacaac 
gaacaccttt 
gaccctcgcg 
atgtttggat 
agtgaagctg 
gttcgttcct 
cgaggtgtgt 
acgcattcca 
taccactctt 
ctgattggtc 
tttttcagag 
tgagctagac 
ctcaagtccc 
gccacaggtg 
tggtcgatgg 
cacaggagcc 



atctaagccc 
tacaccatct 
ttatatcacc 
gaagaaaagg 
cagctctatg 
attcctcaag 
ccaataaaac 
gcagagctga 
tccgtgggaa 
tctgtatatg 
cctgctatca 
ctcatggaat 
ggttgagtag 
catgctatgg 
atgtgttctc 
ttctaggtaa 
gttaacacag 
aataaataaa 
tctcaaattc 
agacatgagc 
tggttaatcc 
ctggaaagag 
tcttcaggta 
caataaaaac 
gcgttgtcac 
gagaggatgc 
ctttgctgat 
atgctgagtc 
gaaataaatt 
tttgggaggc 
tgactgcgcc 
aactaaacac 
ttttaaaaca 
gtctaggaaa 
gcccaataca 
aacagacttg 
ggaactgcag 
ccactcagca 
acaggaaaaa 
attcagtatg 
tactagagca 
aagagcaagc 
tgaagaccag 
attcatattt 
aatggttacc 
tgtgtccgga 
gtgagcgtaa 
gtgttcggag 
cagaccttcg 
cctggtgagt 
gttgcagctc 
aacatcaaaa 
ggctcgggca 
cattttacag 
agctgattgg 
acagggtgct 
caccagactc 
gagctgcctg 
gactgggcgc 
caggaggtgg 



acactcagat 
gctgcagaca 
accctcaaca 
ctgcttctta 
catgtcaccc 
gaattcccaa 
cacggaaaag 
tggccatttg 
agctggaaga 
gctaattaaa 
gcacagaacc 
ttgcagataa 
tccctctgtg
ctttctccaa 
tcttcagaga 
agtgtccatc 
caggcctaag 
taaataaata 
ctggcttcaa 
cattgtgctt 
ttggctggca 
ggactcactg 
gcgtggaatg 
cctgggtgtt 
agctccttcc 
ttggaagctt 
tgtgctttgt 
ttccaagtga 
ataaaagacc 
cgaggcagga 
aatgcactct 
atacttctgc 
ctaaatcaaa 
tttctgaaaa 
agcttgggag 
taacagcaga 
tctgaatgac 
gcacaccaca 
aggaattctc 
atgaaacagg 
catagaagaa 
agttggtatg 
atagagaata 
gcttgtattt 
tacttgtaag 
attggtgggt 
cagttcttaa 
tttcttcctt 
cggcgagtgt 
tcgtggtctc 
atatagacag 
ggacaaacct 
gcctgctttt 
agagccgact 
tccattttga 
gactggtgta 
aggagcccag 
ccagtcccgc 
cgtggagcag 
gggtggctca 
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ggcatggcgg gccgcaggtc atgagcgctg ccccgcaggg aggcagctaa ggcccagcga 44760 

gaaatcgggc acagcagctg ctggcccagg tgctaagccc ctcactgcct ggggccgttg 44820 

gggccggctg gccggccgct cccagtgcgg ggcccgccaa gcccacgccc accgggaact 44880

cacgctggcc cgcaagcacc gcgtacagcc ccggttcccg cccgcgcctc tccctccaca 44940

cctccctgca aagctgaggg agctggctcc agccttggcc agcccagaaa ggggctccca 45000 

cagtgcagcg gtgggctgaa gggctcctca agcgcggcca gagtgggcac taaggctgag 45060 

gaggcaccga gagcgagcga ggactgccag cacgctgtca cctctcactt tcatttatgc 45120 

ctttttaata cagtctggtt ttgaacactg attatcttac ctattttttt tttttttttt 45180 

tgagatggag tcgctctctg tcgcccagac tggagtgcag tggtgccatc ctggctcact 45240 

gcaagctccg cctcccgggt tcacaccatt ctcctgcctc aacctcctga gtagctggga 45300 

ctacaggcaa tcgccaccac gcccagctaa ttttttattt tatttttttt ttagtagaag 45360 

cggagcttca ccatgttagc cagatggtct caatctcctg acctcgtgat ccatccgcct 45420 

cggcctccca aagtgctggg attacagacg tgagccactg cgccctgcct atcttaccta 45480

tttcaaaagt taaactttaa gaagtagaaa cccgtggcca ggcgtggtgg ctcacgcctg 45540 

taaccccagc actttgggag gccgaggcgg gcggatcacg aggtcaggag atcgagatca 45600 

tcctggttaa cacagtgaaa ccccgtcgct actaaaaata caaaaaatta gccgggcgtg 45660 

gtggtgggca ccggcagtcc tcgctactgg ggaggctgag gcaggagaat ggcgtgaacc 45720 

tgggaggcag agcttgcagt gagccgagat agtgccattg ccttccagcc tgggcgacag 45780 

agcgagactc cacctcaaaa aaaaaaaaaa aaaatagaga cccggaaagt taaaaatatg 45840

ataatcaata tttaaaaaca ctcaagagat gggctaaaga gttgacggaa caaatctaaa 45900 

tattagattg gtgacctgca aaaccagccc aaggaacatc ccagaatgca gcccataaag 45960 

ataaagagag catttccgct gggcacagtg gtatggcagg ggaattgcct gagtccaaga 46020 

gttgcaggtc acattgaacc acaccattgc actccaggcc tgggcaacac agcaatactc 46080 

tgtctcaaaa aaaaaaaaaa ttaaattaaa aaagacagaa tatttgagag aaaaaaatgc 46140

ttatttcaag aaacatgaaa gataaatcaa gatattctaa ttcccaagta agaataattc 46200 

cagaagcaga aaatagaata gaggcaagga aacactcaaa acttctccag tgccatagaa 46260 

atgtgtatta atctttagaa tgaaacggac taccaaatgc tgagcaggaa gaacaaaaga 46320 

gatccactct taagccagtg tggtgcccaa gcgcagtggc tcatgcctgt aatcccagca 46380 

ctttgggagg ccgaggcagg tggatcacct gaggtcagga gtttgagatc agtcaggcca 46440 

acatggtgaa accctgtctg tactaaaaat acaaacatta gctgggtatg gtggtgcaca 46500 

tctgtaatcc caactacttg ggaggctaag gcaggagaat cacttgaaac caggaggtgg 46560 

aggttgtagt gagccgagat catgccacac tcccagcctg ggtgacagag caagattcca 46620 

tctcaaaaaa aaaatccact cctagacaaa taatagttaa attttagaac accaaggaga 46680 

aagaaaaaaa attgtaaagc ttcagagaaa ataaacatta actacaaaga aacgagagtc 46740 

agacgcgtgc acttcttcct agataccagc agataaagca atatctccaa aattcagaag 46800 

gttttaacgt agaatcctat acccagtcaa gaatattcac atggaaaagt gaaataaaaa 46860 

acattgttta aacatgcaag ggttcagaaa gtttaccatt cacagaatcc ctgaaaacaa 46920 

aaccaaataa tcacttaagg actcattaag aaaacaaatg aaataaaagc accaatgatg 46980 

agtaaataat cagaaaaatt tacagtttac ctaaataact gtttatgcat aatgtatgaa 47040 

aacccaaaaa tttaatatgg gacagaatta aaatcatgat aagattcttt tttgctttac 47100 

tcatggagag ttcacataaa cagattatct tttaatagca agagaaaaaa atgtttagat 47160 

atgtgtgaaa aactaagggt accaaaacag tgcaaattca tttatcatca ggaaaatcca 47220 

aattaaaacc acagtatcca ccagaataac taaaaggtaa aagacagaaa ttaccaagag 47280 

ttggcaagaa tgtggagcaa ccacatatac ttctggggta aataagttgg tgcaaccggt 47340 

actgaaaact gtttgctagt atctactaaa accgagcaca tgcacagact acaaccaagc 47400 

agttccactc ccagatacac actcaacaga aatgcacaca ctcactcaac aaaagacgtg 47460 

tactagagtg ttcatgtact tactattcat aatagtccaa aaatgcaaac aaccaactgc 47520 

caatcaaagt caaatgtata tctatattag ggatatatac aatggcatat acacagcaat 47580 

gagaatgaaa tgaaccagct cggcacagtg gttcatgcct gtaatctcag cactttgggc 47640 

gggtaaggca ggcagatcac ttgaggtcag aaatttgaga ctagcctggc caacacggtt 47700 

aaaacctgtc cccactaaaa acacaaaaat tagccgggca tagtggttgc aggcctgtaa 47760 

ttccagctac tcgggaggct gggttgggag aatcgtttga acccgaaagc cggaggtcgc 47820

agtgagcgga gatcgtgcca ctgcactcca gcctggacga tagagcaaga ctccgtctca 47880 

aaaaaggaaa tcaaaaatat aaaataagat gacaggaata atccgcaaaa gatcagtaat 47940

caaaataaat ataaatgggc taaagctacc tattaaaaga caaagatttc acacccataa 48000 

ggatagctac tatcaaaaaa agagagagaa taacagatgt tagcaaggat gtatggaaac 48060

tgaaattctc acgcattgct ggtgagaata taaaatggtt cagcctctgc ggaaaacact 48120 

atgctgggtc atcaaaaaat taaaaataga agtactactt gatccaacaa ttctacttct 48180 

gggtatatac ccaaataact gaaagcaggg tcttgaagag atatttgtac acccatgatc 48240

atggcagcat tattcataat agctatgatg tggaaccaac ataaatatcc tttgataaat 48300 
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atatggataa 
aagaaaattc 
ataagccagt 
gtacctaaaa 
ggtaatggat 
acaatgtgca 
taaaaataat 
attaactaat 
cacagaaaat 
caagtacagc 
tcccaaaagg 
cagcttcaaa 
acccctacaa 
agagaaaaaa 
taatatagaa 
gactgcaaat 
atcctaaaaa 
actctgaaat 
aggcattttc 
atgtacttga 
ggaaagtgaa 
caaaaacaat 
aatgtgaggt 
aataaatgag 
aagtagaaag 
atcaataaat 
agcaagtctg 
aacatacaga 
aacattatgc 
aatattaaat 
atgaaaaggt 
agtctgcagt 
agaaaatgat 
cacagataaa 
gctagcttat 
aaagaaaatc 
tgatcatatc 
gactttttaa 
actcggtgaa 
ttaacctcaa 
aaacatgaga 
cgaatcaagc 
ctgtagtttg 
tttctgttga 
actgaaataa 
tgaagggctg 
agttgctggt 
aagagcactt 
cgtgctggca 
aaaaaaaaaa 
gtcgttactt 
tgcccccggg 
ccctagccct 
gcgaagggct 
cgggaggctc 
ggcgagggct 
cgggtccctg 
ctgcggcacc 
ctggggacag 
tcccagaaag 



gcaaaatgtg 
tgacacatgc 
tataaaaaga 
taggcaaatt 
acagagcttc 
cacacttaac 
aaataataaa 
taaacaaaat 
tgaaaatcag 
aatataaaga 
tacaattcac 
aatacaacat 
gaatcataat 
aataagttaa 
ctgtataccc 
taatctgttc 
ccttccagct 
caatagaaga 
agatattaca 
gggagaagaa 
gaaatggtaa 
agcaataatg 
aagtggaatg 
ccaggttttc 
ataaaaaaca 
cctggttaat 
atcaaaaaaa 
tacatacaga 
tagcatatat 
ttactttgaa 
aattaaatac 
caagttctgc 
ggaaagtttc 
aatggggtaa 
tgatgtgaac 
tagcaatgtg 
aatgcatgct 
actcttagta 
gatacagagg 
tagtcaacac 
gacatctgtt 
gaaaaactat 
ggctaccaaa 
ggtattagga 
cacggtgttt 
acttgtgccc 
ggataagcct 
tcacaggaaa 
aatggagctc 
aaaagcaagc 
ccgcacgccc 
ccttccaggt 
tcccctgtca 
tctcggtcct 
ccgagaccca 
ggggtcgcct 
cgacaaccct 
ggtcccaccg 
gacacgcgac 
gcagtcccgt 



gtgtatacat 
tacaacatgg 
caaatactat 
catagagaca 
aattttgtaa 
actggggaac 
ttttatgtta 
ccagccataa 
tgactagaaa 
gaatgaacaa 
caagaagata 
ttaaagaaaa 
gggagtcttc 
ggatgcagaa 
aatatactaa 
ttaatctttg 
acaaaacatc 
cacatggtga 
aaaacagaaa 
aaatgttcca 
acaggtagat 
tctcgttgga 
aaagaattag 
cattcaaaca 
gaaattaatg 
aaaagctggt 
aagagaaaag 
tatgtaagag 
taaatttcaa 
gaaacagaaa 
tgatattaac 
caaacttgag 
ccaatttaat 
actatatgcc 
aatccaaaag 
accaccactt 
acacaaaagc 
attaggcata 
gaatgctccc 
tgcagcgaga 
gtttaacaga 
taaaactgag 
ttcaactcgc 
ttgacaagcc 
ggaactggat 
ttactcaaaa 
gccaaccagc 
atttttttcc 
cagcaaaata 
acataacact 
gcaggtccgc 
gcactgcgcc 
ccccggccag 
ctgcaccacg 
ggagccgggg 
ccagggccgc 
cgggcccgga 
ctccgggccg 
gaggggaccg 
gcccccacga 
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tcaatggaat 
atgaaccttg 
atgaggtact 
aaaagcagaa 
gatgaaaaaa 
tgtaaactta 
ttttaccaca 
gctaatggta 
aagatattcc 
aaaaaaaatt 
caagaattgt 
atatatatta 
aatacaactc 
aacctgaatt 
gagttcaggg 
tttttctttc 
tttttaaaaa 
aaccaaaatt 
attgatcatt 
aagaaaagta 
aaagctaata 
agggttgaag 
aagtccttgc 
gttaaaactt 
tcatagaaaa 
tctttgaaag 
gtaccaaaaa 
tctgttttct 
taatgttaat 
aactgagaaa 
tgcctaaaca 
ggaacagata 
cagagaggac 
aaactcagat 
tgcattttaa 
atgttaacaa 
atttgggcaa 
aacagaaatg 
taaaaccaag 
gtaatctatg 
caataagatc 
acaggcttta 
ttgcttggag 
tgtgctcctc 
aacagaatct 
aacactttat 
tcggcgtaat 
gaactgtatg 
agatattcag 
aatttccttg 
accaccggga 
gcggcgcccc 
gaaggggcgg 
cagcaccccc 
ccgggcgtgc 
agctgtcggg 
ggggaggagg 
ggcaggacag 
gggcccccgc 
cggactgccg 



attaattagc 
agggcattac 
atattagata 
tggtggttgc 
ttctggagat 
aaagtagtaa 
atatttatta 
agagtaacaa 
atataaatgc 
aaataagatg 
gaacctttaa 
aacatagaaa 
tccatatcaa 
accatcaata 
aacagtcgtg 
agcactgtgg 
tataaaaaaa 
ctagaataca 
gctgaagtaa 
tctgtgatac 
aatgttgacc 
taaaaataca 
cttgttcaca 
gaacaaaata 
ataaaaaatc 
gattaataaa 
aagtactgta 
tacaccagaa 
gattttctag 
aataaatgat 
acaccagcag 
attcttctat 
agcctgatcc 
accaaaaccc 
attagcccag 
ttttaagacg 
aaaacccaac 
tacttaatgt 
cccaagacaa 
gaagacaagg 
acctacttgg 
gtatggaggc 
agttaatcct 
cctcctcccc 
tccaaaaaca 
ctgctgcctg 
tcttcctgca 
ccgcttatta 
agtcaaactt 
catgggcact 
aacccacggg 
agctgacccg 
gagcgcggcg 
aaggcacaac 
ccgcgcacct 
agccacctgg 
cggccacctg 
gccaggacgt 
ggcgaagacg 
gacccccgcg 



aataaaaatg 
attaaatgaa 
ctcatgcaag 
caggggctgc 
tggttgcata 
atggtaaaaa 
aaagacaaag 
ttaaagaaga 
taacaaaaag 
gctcgtttat 
gcacataaaa 
tagtacaaaa 
caggtcaaac 
aacttgagat 
actgacagtg 
cagaatagag 
tacaaaaata 
gggagaataa 
tttctaaaga 
aagaaggaat 
tagaaaataa 
attaaggcca 
ggactgatta 
aactcaaatt 
aatagaatta 
ataatcatta 
tcagaaagag 
tactatatac 
gaaaacagaa 
catgaaaaaa 
cagcccaggc 
tccagagcat 
ttgttatgaa 
taaataagat 
ggttttagag 
aaaatctaca 
acccaccctt 
gatagaatac 
agattcctat 
aaaaaagtaa 
aagaggcaaa 
tcagcttcag 
gcaaagctaa 
catcttcaac 
aaaattgtcc 
cagctcctac 
gagggcaagg 
cataaactta 
ccttaggaaa 
ggggaaggag 
caccgcgcgc 
ggatgcgcag 
gacgccgagg 
agggagggtg 
gtcccactgc 
ctctcagtcc 
ccgctgccac 
ccctcctggg 
cagcacgcct 
ctcgcccgcc 
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51900 
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catcccttca gaccacgcgg ctgaggcgca aagagccggc cggcgggcgg gctggcggcg 51960 

cggctagtac tcaccggccc cgctggctca gcgccgccgc aacccccagc ggccacggct 52020 

ccgggcgctc actgatgctc aggagaggga cccgcgctcc gccggcgcct ccagccatcg 52080

ccgccagggg gcgagcgcga gccgcgcggg gctcgctggg agatgtagta cccggaccgc 52140

cgcctgcgcc gtcctccttc agccggcggc cgggggcccc ctctctccca gctctcagtg 52200 

tctcatctcc ctatctgctc atcctctggt cgcacataat cgatgtttgg gcgtcccaag 52260 

ccagatgtgg accccatttc cgcactctac actggaggtt ttctaagggt ggtgcccgga 52320

ccagcagctt cagcctcatc tgggaacttg agaaaatgca gattctccgt cccacccagc 52380

ctattcggtt tttcctgcac taaaaccatg aaggtggggc ccagcagtcc acattctcgc 52440

aagcccgtca agtgattctg aggcgccctc cagtttgaga gctatgctca cggcctcacc 52500 

tccgccccgc aaggagcccg gtcttgcctg tggcgctagc cgcacacgga cacctcatcc 52560

tgcggggccc gcccccccgc tgcaccctca ccgcccaacg cctcctccgg gatgcagcgg 52620 

aggcgcctgg aagtcggcaa ggtcaacatc cccctcagca tcttccctac cctcacggct 52680 

cctcctccag gggtgcctca tggccagggg ttagaaagag ccactgtgtt tcttgacatg 52740 

gaagtggcct aagaccttaa tgaaaactgc aggagtggaa tgacagaacc tttggtcata 52800 

cttgagggcg tgaagctcaa atgaggagga aggaaaggat ccagggagaa taaccaaccc 52860

tggcaagttg tggcgcccag gtagaggggc gagcctaggc tagcggttct cgaccagggc 52920 

cggtgttgcc cctcctcgcc gccccgcgta catttgggga ggtctggaga catttttggt 52980 

tgtcatgatg cgggagttgc tactgttgcc taagtgggta gacacgaggg tgctcctcaa 53040 

catcctacct gaaggacagg actgccccac aaggaagaat gatccggccc caaataagaa 53100 

accctgggct ggtcagcaac aacccctttg ttctgagaag agaggaggaa agaataaaag 53160 

aagtggggtg aagttttggt ttggtagagg aaacttgaag acattttcac tggaaaggaa 53220

gagaggaaga ggagggagat gtctgtaagg acgagcaaac cgggtgacag ctgatttcct 53280 

catattgaag taatgagtcc tagttataat aaattcctaa taaaaaccca gtttatccct 53340 

gcaataaact tgtctttttt ttttaaatat actgcttgat tctgtttgct aatattttat 53400 

ttacaggctt tgcattgata tgcaaaaatg agatgggcaa taattttctt tttgaatgtc 53460

taatgttgtt tggtttcaga atcaatgtta tgctcacatc ataaaaaatt tggaaccgag 53520 

gcaggaggag tgcttgaggc cagaagttcg agaccagtct aggaaacaca gtgagacccc 53580 

cccatctcta caaaaaaaaa aaaagaaaaa aaaatgggca tgtttgcttt ttccttttac 53640 

tctgaacaat ttaaggagca ttaaaattat ctattctttg aggtttgatc atttcccagt 53700 

taaaaatgtt cctcccagcc tgatgctttc tttggggagg gtaaatcttt taaggctaga 53760 

aaagtttctt ctgtggcaat tttattattt acattttaaa aattattcta gagttaattt 53820 

tgataaagca tgtatttctt aaaacaaatt atcctttttt tccagatgtt caagtgtatt 53880 

tgcataaagt tgaggaaagt agtcttttgt gaatctttta acttctccca aatatcttat 53940 

tttgtgtatt tttgcttctt tattttgtta acttttaaaa gtgtattttt ttttcaaaga 54000 

atcagctctt aggtttatgt ttttggttat actggagctt ttttcttctt ctttttaaaa 54060 

tattttttct cctttatttt ttagacgtat tttgatctaa cgtaatcgga agaaggtaaa 54120 

ttagaatctt ttgttactat tgtgttttta tttctcctta tttctctgaa gtcctgcttt 54180 

ataaatagta ccatgttatt tgtgcataaa tattcatttg tcttatattc ttgggaattt 54240 

tcccacttca tcataaaatg accttccttg tctcatttaa tgtgttcaaa ctttgccctg 54300 

aatttaactt tgtctgatat tttaccatcc tgctgaattt tgtttgttac cccaaacaac 54360 

ctttgctgtt ttcgtctttt ctgaaccctt tattttaggt aatcccttga attagagcac 54420 

taagttttgc tttgtgatta aatctgaaaa tctttatctt gccatagatg agttgagccc 54480

tattcatgtg acagctatat tatgctgttt catagccctt ttggtccttt tttcactctt 54540 

gcattgcata ttttgtgttt attgtgtttt gtgtttcttc tgataatttg gaaggtttgt 54600 

atttttattc agggagttgc cttataatca tactccgcaa tacacatcgt cctcagtttc 54660 

ttcagactgt ctgttaactc cctattctga ataaaaatga cattgtaatt tccctctttt 54720 

ttctttaccc cttttcttct cctcacctaa tgtaaatgat tttatccttc tttagtattt 54780 

gcttttttaa ttaactacat ttataaatat ctttatcact tgatttttaa atcagctttg 54840

aatgagatat ttggattcct agatataaaa gatgttaatt ataccatttc cacgttagta 54900 

ggtttataaa atcatacatt ctgctgtgta accataatcc cacgtttgtt ttagttccac 54960 

tcctacagtt aaaagattca gaagtattat taacagttat tttgccatag ttttttcccc 55020 

aacccatttt gtggtaagtt atgatcctgc tttagtttct taagaataat ttatagagca 55080 

gagtgtggtg gctcacgttt gtaatcccag cactttggga gacaagaggt agaaggatcg 55140 

cttgaagcca gcagttcaag accaccctga gcaacatagt gagaccttgt ctctacaaaa 55200 

aattttaaaa tttagccaga cgtagtggcg tgtgcctata gtcccagcta ctcaggaggc 55260 

tgaggcaaga ggattgctag agcccagaag tttgaggctg cagtgacctc tgattgtgcc 55320 

actgcacccc agtctgggca agaaagtgag aacctatctc tttaaaataa caataataac 55380

ttatgaaaat tatattccct gagtttttca tgtttaaaaa tatttgttgc ctttatcctg 55440

taaaagtttg agtataaatt cttgggttat actttattta ttgaagaatg tataagtatt 55500 
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gtcttctaga attgagtgtt gctgtaatga aaccagaagt cagcctggtt tatttttcct 55560 

cagaaatgag gtaattgccg gccggacacc gtggctcatg cctgtaatcc caacactttg 55620 

ggaggccgag acaggtggat cacgaggtca ggagattgag accatcctgg ctaacatggt 55680 

gaaaccccgg ctctactaaa agtacaaaaa gttagctggg catggtggtg gacgcctgta 55740 

atcccagcta cccgggaggc tgaggcagga gaatggcgtg aacctgggag gaggagcttg 55800 

cagagagctg agatcgcgcc actgcactcc agcctgggcg acagagtgag actccgtctc 55860 

aaaaaaacaa aaaaaaaaca aagaagtgaa gtaattgcca tgatgctcca agaattatct 55920 

ctttgtctat gaaatccaga aatctcactg ttatacattt tggaattatt attctgggcc 55980 

aatatttcct gggacacaat agattgactc tatagattta attttttttt tttttttgag 56040 

acagagtctc actgcaatct cagcttactg caacctctgc ctcacgggtt caagcaattc 56100 

tcctgcctca gcctcccaag tagctgggac tacaggcgcg tggcaccatg cctggctaat 56160 

ttttgtcttt ttagtagaga cagggtttca ccatgttggc caggctggtc ttgaacgcct 56220 

aacctcaagt gatccacctg cctcagcctc ccaaagtgct gggattacag gcgtgagcca 56280 

ccatgcccag cctcaattcc tctttctatc tggtaatttt tctgaagttg aaaacatttg 56340 

ttctaatacg ttatttcagt gttcttctaa gatgtgtaaa gcaccctatt cccaggtcag 56400 

cccccatctt gctagtgagc tcggctggtt cttcacaaga gctctggttt tctcctgctt 56460 

aatctcaagt acctctgtca gcctccacct ggtttatgat ttggagtttt ttggtttttg 56520 

ttttttgttt ttgacagagt cttactctgt cacccaggct ggagagcagt ggcataatct 56580 

cagctcactg caacctctgt ctcccaggtt tgagcgattc tcctgcctca gcctactgag 56640

tagctgggat tacaggcgcg tgccaccaca cccggctaat ttttgtattt ttagtagaga 56700

tggggtttca ccatgttggc cagggtggtc ttgaactcct gacctcaggt aatccacctg 56760 

cctcagcctc ccaaagtgct gagattacag gcgtgagcca ccgcgcctgg catggtttgg 56820 

agttttaatc tgtagtttta ataaagatag tgcttatgtt tgtgtttctt atatttcttg 56880 

gtactcttgg gtaatttgta agatccccat atctacacaa gaagtccatt ttcaattctt 56940

ttcttcagac tgtttatttt attttatttt attttatttt tatgtttgag atggagtctc 57000 

gctgtgtcac ttctggaggc tggagtgcag tggcgcgatc tcaggtcact gcaacctccg 57060 

tctcccgggt tcaagcaatt ctcctgcctc agcctcccga gtagctggga ttacaggcac 57120 

ctgccacttt ttaatttttt tagagacaga gtctcgcttt gttgaccagg ctggagtgcg 57180 

gtggtgcaat catggctgac tataacctcc aaatcctggg ctcaagtgat cctcctgcct 57240

cagcctcctg agtagctggg actacaggca catgccacca tgcccagtta attttaattt 57300 

ttttgtagag acagggtctc catatgttgc ccaggctggc ctcctactcc tggcctcaag 57360 

taatcctcct acctcagcct cccaaattac taggattata agcatgagcc accatgccca 57420 

gccttgttct actactttaa tttcatatgt taggtgacca tgtaattgat catccaaacc 57480

aggatactgt aagaatgaaa gaggctgaca gtagtatgat gctgggacta gcattgtgca 57540 

ctgagattat ttctgggaaa gcaggagata cggtcaccct acttatagtg tgcttgtctt 57600 

tggattgttg aatttggagt ttctatttgc aggcttattt caactgggca gccttgatcc 57660 

gccctgccca gcaatgctac cgttctctcc accgggtctc tgggacccct tcagtcacta 57720 

tacttagctc agttccccac cctcccactc cctaaaagcg taaccaggaa tcctgcctca 57780 

ggtctactgc cgtcttccgt gggctgtttc agttcctatt acccagagtc aaactcccag 57840 

cattccctac ctgattccag acttggagtc cagagcttta acctcttcag gccaactccc 57900

cactttgcat ttctgtccct atatcttagt ccatggagat acatttcatg tctttgagtc 57960 

tacttacaaa gtaaattttg ctgtttttta attttttttt tgagatggag tcttgccctg 58020 

tcacccaggc tgtggtgcaa tgacgccatc tcggctcact gcaacctccg cctcctgggt 58080 

tcaagcgatt catctgcctc agcctcccaa gtagctgtga ttacagacag gcaccaccac 58140 

gcccagctaa ttttttttat cttttagtag agacagggtt tcaccatgtt ggccaggctg 58200 

gtcttgaatt cctgacctcg tgatctgccc atctcggcct cccaaagtgc tgagattaca 58260 

ggcgtgagcc actgtgccca gccaattttg ctttttttat atttcattgc tatatgttta 58320 

gaggataagt ttacagtgct atatgcattc ccaaatatta gaccaaaaaa atctccaaaa 58380 

aattagaaag aaaatccaaa aaatctcaaa aaataccaaa aagcaacaat ctcacagacc 58440 

atactcactg acccccaata aaataaaatt agaaattaac cacaacttaa caaaataaag 58500 

tactcaagtc agagaggaaa gaggaaataa acatcaaaat tacaaagtct aggcggtggc 58560 

tcacgcctgt aatcccagca ctttgggagg ccaaggcggg cagatcacaa ggtcaggaat 58620 

tcgagaccag cctggccaat atggtgaaac cccgtttcca ctaaaaatac aaaaattagc 58680 

caggcatagt gatgtgtgcc tgtaatccag ccacttggga ggctgaggca ggagaatcac 58740 

tgaacccagg gagacgaaga ttgcagtgag ccaaaatcgt gccactgcac ttcggcctgg 58800 

gtgacaaagc gagactccat ctcaaaaaaa aaaaaattac aaactcttta gatagaaatt 58860 

ttggtgtttt tttttgagac ggagtctcac tctgtcgcag aggctggagt gcagtgggac 58920 

tatgtcagct caccgcaacc tccatctcct ggattcaagc aattctcctg tctcagcctc 58980 

ccaagtagct aggattacag gcgcccacca ccagacccag ctagttttta tatttttagt 59040 

agagatggtg tttcaccatg ttggccaggc tggtctcaaa ctcctgacct caagtgatcc 59100 
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acctgcttca gcctcccaaa gtgctcagat tacaggcgtg agccaccgca ccccacctag 5 916 0 

atagaaattt caacatgagg ccgggcacaa tggctcacgc ctgtaatctc agcacttcag 59220 

gaggctgagg cgtgggagga tcacttgggc ccaggagttc aggaccagca tgggtgacag 59280 

agacagaccc tgtctctatt tatttgaaaa aaaaaaaaaa aaagagagag agaaagaaat 59340

ttcaacatga aaagtatctc tcaaaccctt cgagatgttg gcaaaaagcg actcaaagga 59400

aaatgtatta ctgtgtgtga atttgcttga aaataagaaa gaggccgggt gtggtggcta 59460

acacctgtaa tcccaacact ctgggagtcc gaatcaagtg gatcatgagg tcaggagatc 59520 

gagaccatcc tggctaacat ggtgaaaccc tgtctctact aaaaatacaa aaaattagct 59580

aggcgcggtg gctcatgcct gtaatcccag cactttggga ggctgaggca ggtggatcac 59640

ctgaggtcag gggtttgaga ccagcctggc ctacatggtg aaacctcgtc tcttctacaa 59700

atacaaaaat tagctgggcg tggtggtggg tgcctgtaat cccagctact cagaggctga 59760

ggcaggagaa tcgcttgaac ccgggaggcg gaggttgcgg tgagccgaga tcgcaccact 59820 

acactccagc ctgggcaaca gcctgggtga cacagtgaga ctccatctca aaaaatacaa 59880 

aaaattagct gggtgtggtg gcctgcgcct gtagtcccag ctacccggga ggctgaggca 59940 

ggagaatgga gtgaacctgg gaggaggagc ttgcagtgag ccgagatccc accactgcac 60000 

tccagcctgg gcgacagagc aagactcttg tctcaaaaaa aagaaaaaaa aaggaaaaaa 60060 

gaaccctgat aataaagaaa ccaaatgttc aactctcaaa gctcggacac tttaaagaaa 60120 

taattaataa aggcagaagt taaagggagg atgataaagc aatttttttt gttggttttt 60180 

ttgagatgga gtcttgctct gtcacccagg ctggagtgca gtgatgcgat cttggctcac 60240

tgcaacctct gcctcccggg ttcaagcaat tctcctgcct cagcctcctg agtagctggt 60300 

actacaggtg cgcgccacct ggcccagcta atttttgtat ttttattaga gacggggttt 60360 

caccatattt gttaggctgg tctcaaactc ctgatctcag gtaatctgcc cacctcggcc 60420 

tctcaaagtg ctgggattac aggcaggcgc caccgcgcct ggcctaaagc aaaatattgg 60480 

ttctgtgcaa aaggtcaata aaaagagcaa acgtttacaa actggagcca gcacccattc 60540

agctcagtgt gtctggagaa aaaacaatct cgcttcagaa ttcatgatta cgcagccctt 60600 

tttgcttcct aaaaatccta ctatgttgct gttgaccatt ctctctcttt ctctctctct 60660 

tgccttctct ccagaaaagc tattcagaca ttctcctctt tcctcaaacc tccaacactt 60720 

cctcctccat ccttagcctc agctgctgac ctcacttcta atcattgaga aaccaggaga 60780 

agcatttaag agtgaacctc cgcctccccg cacgggcaaa accacccacc cacagaattg 60840 

tgccccaatt ctgcgtcctc tcctctcacc atggatggac ggtccaggct ccgagccaaa 60900 

gccaggcctc ccctggagct ctggatccac cacctgcagc ttctcaggca gggccccagc 60960

agctcccctg ctcccttgta ccatcaatcc ctcccctcac tgggtcactc ccaacaatat 61020 

atatatttag tgatgtttct cccatgtggt aaaatcactt agcctctctc ctcccccagc 61080 

tactatccta tttgtttctt tccattctct gcaaaacttc tcaaagcatt gtgtctatgt 61140 

gctgactcca tttatcttct cccgttctct gctgagtcct tcccacagac tctcacccca 61200

gttactccat gaaatgacct ctgcactgcc acatccaatg gtgaatgttc agttcttaat 61260 

tttattcagt ctttcagcag catttgacct ggccgatcac tccctcttct taaaaatact 61320 

tttctcagcc aggcgtgatg gctcacacct gtaatcccaa cactttggga ggccaaggcg 61380

ggaggatcat gagagcccag gagttcaaga tcagcctggg caacatggca agaccctatc 61440

tctacaaaaa ctaaaaagta gccagtgtga tggcatgcac ctgtagtccc atctacttag 61500 

gaggctgagg cagtaggatg acttgagcct gggaaatcaa ggctgcagtg agccatgatt 61560 

gcaccactgc actccagcct gagtgacagc gagaccctgt ctcaaaaaga caaaatagga 61620 

aacttttctc agcatattcc tctgattctc ctgctgcttc tgtctgcaca gattcagtct 61680 

cctttgccgg ttcttcctca tcctcctgat ctcttgacct tgaagtgccc cagagtacag 61740 

tctttttttt tttttttgag acgcagtctc gtctgtcacc caagctggag tgcaatggcg 61800 

aggtctcagc tcatgcaacc tctgcctcct gggttcaagc gattctcctg cctcagcctc 61860 

ccaagtagcc aggactacag gcacatgcca ccatgcccag caaattgttg tatttttagt 61920 

agagacaggg ttttactata ttggccacgc tggtctcaaa ctcctgaact cgtgaaccac 61980 

ccgcctcggc ctcccaaagt gctgagatta caggcatgag ccaccacacc cggcccagag 62040

tacagtcttt agacggcctc tctacctata cttgctcccc tcataaactc ctcctgcctc 62100 

atggctttaa ataccatcgg tagactgatg actcccatat ttctcttttt tttttggaga 62160 

cggagtctcg ctcagtcccc caggctggag tgcagtggcg cgatctcggc tcactgcaag 62220 

ctccacctgc caagttcaca ccattctcct acctcagcct ctccagtagc tgggactaca 62280 

ggcacccgcc accacgcctg gctaattttt ttgtattttt agtagagatg gggtttcacc 62340 

atgttagcca ggatggtctc gatctcctga cctcgtgatc cgcccatctc ggcctcccaa 62400

agtgctggga ttataggtgt gagccaccgt gcccagccga tgactcccat atttctatct 62460 

cttgctgtgt gggagttctc ctcagaactc catactcata aatccaactc tcataaatag 62520 

tatctcaaat gggcaatatg ctcaaaagtc aattcctact tttctcccta aacttgcttt 62580 

cctgcagtct ccaccatctt aatgtccaat ctaacattag gaggcaaaaa ctttgaagtc 62640 

attcttgact cttctctatt acacacccta tccaatcttt ctgcagatcc agtcgacccc 62700 
caaatccagt tagctctcat catctcccct gttaccccct ggtccaggcc atcttcctct 62760

ctcacctgaa tcactgcagc attctcctca ctggtctctt tggttctgtt ttcactccac 62820 

cttagcatag tctccacaga gcagtcagag ggatcctttt aaagtgtaat tcccatcctg 62880 

tccctgctct gctcaaaacc ctgtcgtgat tcccgtttta atctgtcaga ttaaaagcca 62940

gagtctttcc agtgacctac atgatctgcc tattatcacc tcccacttct ttccccttgc 63000 

tcactccact ccagctctgc agctgtcctt tctgtttcct gaacagccca gattttgctt 63060 

ctttagaacc tttgtatttg ctgtcccctc tgtctggaat gtttttccag gaagtcacct 63120

ggctctctcc tgcacttcct tcctgaccac catgtttaaa aatcactcaa acacacttca 63180 

ggccggacat ggtggctcac gcctgtaatc ccagcacttt gggaggccaa ggtgggtgga 63240

tcacctgagg tcaggagttc gagaccagcc tggccaacat ggtgaaactt cgtctctact 63300 

acaaatacaa atagtagcca ggtgtagtgg cacacacctg taatctcagc tactcaggag 63360

gctgaggcag gagaatcgct tgaacccaga aggcagagga ggtgcagtga gccaagatca 63420 

cgccacaaca ccccagcctg ggtgacagag caagacccca tctcaaaaaa aaaaaaagaa 63480 

aaaaaaatca cacaaacaca cttctcttca tattcctttt ccaagtttta tttttctcca 63540

gaatacttta cattgtttta atggaagttc tccgtttccc cccaactaga atggatactt 63600

cctgcaggta ggcactctag tcctcccatc caagtactaa ccaggctcaa ccctgcttag 63660 

cttctgagag caggggagat caggcctgtt cagggtggta tggcccagga attttgattc 63720 

tgttttattc attgctgttc tgttgattct cttttgttcc tcctcctagt gctgagaaca 63780 

ctacttgtac ataataagca ttcaataaat atttgttgaa tgaatgactt gttgaatgaa 63840

ttaatctcag aaatgcagga ctggttctac attagaaaat ttttcaaggt cattctctgt 63900 

tgtcgtaaca cattaagaga ggaaaatttt gtactctaaa tcatttgata aaatacatac 63960 

tgatttctgt tttcaaaaac tcttagtggc tgggcgaggt ggctcacatc tataatccca 64020

gcattttggg aggacgaggt gggcggatca cttgaggtca ggagtttgag accagcctgg 64080 

ccatcatggt gaaaccctat ctctactgaa aatagaaaaa ttagccgggt gtggtggcgc 64140 

atgcctgtag tcccagctac ctgggaggct gaggcaggag aatggcttga acccgggagg 64200 

cggaggttgc agtgagccaa gatcatgcca ttgcactcca gcctgggtaa cagagtgaga 64260 

ctccatctca aaagaaaact cttagtgagt ttaggaatcc aaggaagacc ctcaaactaa 64320

atagataatc tagctaccag aagccttcag taaaccttaa cactccatgg tgaaacatta 64380 

gaaacattcc tactaaaaga caggctaaga atgcctgcaa tcttcacggc tagtccaaga 64440 

agtcaaaaag aagaaatgag cgctgattta aaaaaataaa caaacaaaaa actaccgatg 64500 

cagaggctgg cagcaaggac tgaaggactg tacagtactt gcctggagca ggcggatggc 64560

cacacccctg cgaagcctgc tcagctggct gggggacgct ccagtgtgtg agtggcagga 64620

tgcagggtac ttcctctgcc agggagttgc actggggaga tcctccccca ctcacacttt 64680 

ggcagctggg gctttggaat gtgacttagc ttctgtcaaa gggtcaatcc accctttgat 64740

atatgatgca aaggcgaaca tatgatgcaa aggtgagaga acagcccaaa ttaggafcrttt 64800

taccacagct gtggaggtgg acagcgacag tggtgggccc tggccagact tttcatgctc 64860

aaaggtggtg gttgttcttc ctacttcttg tccctccagg gcttcctttg cctgtgtgct 64920 

gaacctgctt cttttaattt tttttaactt ttttaaattt ttaactgttt taattaaaac 64980 

aaattttgaa aactgtctga acctgctttt gaaccctgct atgatttgaa tgtttgtccc 65040

ctgccaaact gattttgaaa cttaatctcc aaagtggcaa tattgagatg gggctttaag 65100 

cagtgactgg atcatgagag ctctgacctc atgagtggat taatggatta atgagttgtc 65160 

atgggagtgg catcagtggc tttataagag gaagaattaa gacctgagct agcatggtcg 65220 

ccccttcacc atttgatatc ttacactgcc taggggctct gcagagagtc cccaccaaca 65280 

agaaggctct caccagatac agctcctcaa ccttgtactt ctcagcctct gtaactgtaa 65340 

gaaataaatg ccttttcttt atgaattacc cagtttcaga tattctgtta taaacaatag 65400 

aaaacgaact aaggcaaact ctcatgattc tactgccatg ccattccaat aaactccctt 65460 

tatgcttaag agagccagag ttggccaggc gtggtgactc acgcctgtaa ttccagcact 65520 

ttgggaggcc gaggcaggtg gatcacaagg tcaggagatc gagaccatcc tggctaacac 65580 

ggtgaaaccc cgtctctact aaaaatacaa aaaaattagc tgggcgtggt agtgggtgcc 65640

tgtagtccca gctactcggg aggctgaagc aggaggagaa tggcgtggac ccaggaggcg 65700 

gagcttgcag tgagtcgaga tcgtgccact gcactccagc ctgggtgaca gaatgagact 65760 

ccgtctcaaa aaaaaagaga gccagagttt atttctgttg cttgcaacca agaaatctgg 65820 

ctggtgcact gaagtttcca taaataatag caatttaaag actctttcca agccaggcaa 65880 

tgcctagcct tgtgtagtcc ttgtggtaat acattcattc attcatttgt tcaaccaact 65940

gtgctccaga gactaagaat acaaaaatgg gggccgggtg tggtggctca cacctataat 66000 

cctagcactt tgggaggccg aggcaggtag atcacctgag gtcaggagtt cgagaccaac 66060 

ctggccaaaa tggtgaaacc cctactctac taaaaataca aaaaattagc tgggggtggt 66120 

ggcggacacc tgtaatccca gctactcgtg agactgaggc aggagaatca cttgaacccg 66180 

ggaggcagag gttgcagtga gccgagatcg caccactgca ctccagcctg ggcaacaaga 66240 

gcgaaactcc acctcgaaaa aaaaaaaaaa aaaaaaagag ggccggggct gggcgcagtg 66300 
gctcacgcct gtaatcccag cactctggga ggccaaggca ggagaattac gaggtcagca 66360

gatcgagacc agcctgacca acatggtgaa accccatctc tactaaaaat acaaaaatta 66420

tccgggcgtg gtggcgcaca cctctagtcc cagctacttg ggaggctgag gcaggagaat 66480 

cgcttgaacc cgggaggcag aggttgcagt gagccgaaat catgccactg cactccagcc 66540 

tgggtgacag agtgagactc cgtctcaaaa aaaaaataaa aaaaaaaaaa gaattcaaaa 66600 

attgtagagt tatagtgtgc ttctagttta gttgagagga catctgtcct tcaaggaagg 66660 

ctagaatcta taccctgagt ccttactgaa atcaatccag cagtcaaaac atgggaccaa 66720 

cgatcacagc agtaagatag gaagagcacc tttgtacatt tagctcatgt tgagataagc 66780

cactgacaga gctgaaggaa gctcacagtt ctgggttcca tcctttggca tttaaaaaga 66840 

aaagtgctaa gaaaattcgg ttggtcacgg tggctcacgc ctgtaatccc aacactttga 66900 

gaggccaagg caggcagatc acgaggtcag gagttcgaaa ccagcctggc caacatggtg 66960 

aaaccccgcc tctactaaaa acagaaaaat tagccgggca tggtggcgca tgcctataat 67020 

cccagctact caggaggctg aggcaggaga attgcttgaa cccgggaggg ggaggttgca 67080 

gcgagtgaga gcaggccact gcactccagc ctgggagaca gagcaagact ctgtctcaaa 67140 

aaaaaaaaag aaaaaaagaa agaaaggaaa aaaagaaaga aaaaaaaaga aaaaagaaaa 67200 

ttcaggccag gccaggcctg gtggctcaca cctgtaatcc caacactttg ggaggctgaa 67260 

gcgagacggt gccttagccc aggagtttga gaccagcctg agcaacatag cgagaccctg 67320 

tctctataaa aaaaaatttt tttttggcca gacgcagtgg ctcacgcctg taatcccagc 67380 

actttgggag gccgaggcag gtggatcacg aggtcaggag atggagacca tcctggctaa 67440 

cacggtgaaa ccccatctct actaaaaaat acaaaaaatt aaccgggcgt ggtggcgggc 67500 

gcctgtagcc ccagctactc gggaggctga ggcaggagaa tggcgtgaac ccgggaggcg 67560 

gagcttgcag tgagccgaga ttgcgccact gcactccaga ctgggagaga gtgagactcc 67620 

gtctcaaaaa aaaaaaaaaa aaaaaaaaat taattgtcag gtgtgctggc atgcagctgt 67680 

agtcctagct actcgggagg ctgaggtaag aagatcgctt gagcccagga gttcaaggct 67740 

gcagtaatag tgcctctcac tctaccctgg gtgacaatga gaccctctct caaaaagaaa 67800 

gaaaaaaggg aaagaagaaa agaaagaaag aaagagaaga aaggaaggaa gaaagaaaga 67860 

aaaagaaaag gaaggaagga agaagaaaaa aaaagaaaga aagaaaagag agagaagttc 67920 

aaagaccaaa gggtcaggat cccaaaatag tttttatgtt ttatttattt atttacttat 67980

ttatttttga gacagtatgg ctctgtcgcc caggctggag tgcagtgatg cgattgcggc 68040

tcactgcagc ctccaaactg ggctcaggtg gccctcccac ctcagcctcc cgagtagctg 68100 

ggaccacagg cgcgtgccac catgcccagc taatttttta attctttgta gagatgaggt 68160

ctctatatgc tgcccaggct ggtctcgagc tcctgggctt aagccatcca cccgcctggg 68220 

cctcccaaag tgctgggatt acagaagtga gccaccgcgc ctaatcgggt ggtttgtttg 68280 

tttattgacg gggtctcgct gctgcccagg ctggagtgcc agtggctgtt cacaggtgca 68340 

gtcctggagc attgcatcag ctcttgggct ctagcgatcc tccagagtag ctgcagctgg 68400 

gattccaggc gcgccaccgc gcggggctca gaatgggttt ttatattgag ggttatgctg 68460 

ccacctagag gatatatgta gtaccgaact gtgtgcgcag ggaggctgag gttgcagtga 68520 

gccaagatga tgccagggca ctccagcgtg ggtgacagag caagatttca tctcaaaaaa 68580 

aaaaaaaaaa aaaaaaaaaa aagaattgaa agtaaggtct tgaagagata tttgtgcctg 68640 

tatggtcata gcagtattaa ctttgaccca ctagctaaaa cacaaaagca acatgtgtct 68700 

gtcagcaggt gaacggataa acaaaatgtg gtatatatgt acaattgaat attattcagc 68760 

ctttaaaaag gaataaaagg ctggatgcgg gggctcacgc ctgtaatcct aacactttgg 68820 

gagactgagg tgggtggatc acccgaggtt aggagtttga gaacagcctg gccaacatgg 68880

tgaaacttca tctctactaa aaatactaaa attagccggg catggtggca cttgtctgta 68940 

atccaagcta ctggggaggc taaggcagga gaattgcttg aactcaggag ccggaggttg 69000 

cagtgagcta agatggcacc actgcactcc agcctgggca acagagtgag actccatctc 69060 

aaaacaaaca aacaaaaaat tattatttcc aaagaaacaa gaccctgggt ccatttccca 69120 

gcccacacct gatgttgact cacaacacac agcctggttt gctatgagcc tgcttcattt 69180 

aattgtcacc ttaacttcac atcaccctca agtcctggaa taactctttg ctgacctttg 69240 

tgtgctgagc catctccatg tcgctcaacg tgcagtccct ctcactgcac tgagtcaata 69300 

gccagacgtg gtctgactgc agggtcatcc ttggtggctt aggctgactc gggcatagca 69360 

gggtgctctg agacctcacc gcatataggc tttgccccca ataaactcta tataatattc 69420 

atattatgtg gtctgggtgt gtgtagcttt gcactgtctt ctcgtgacag tgccctcaac 69480 

ctctttccca ggatttcctc ctctacctcc tcaagtccca ctgctctgca aagaccaaaa 69540 

gctgcagagt cccagctccc tcctttacac cccacgacgc agcctcctct ctcagaaccc 69600 

tttaaacaga gtcttttact gcagatccca agaacagcca cacccctctc tcccacccac 69660 

tccagacaca cccaggtaat tatagcaccc agggtaacta tgtagatgga gtccctggaa 69720 

catgtggata gtgccccctg ggagtatgca aaagcaacat tgctggcacc tgcagagaac 69780 

agggtgacat ccaggaatca gagcatgggc ctctgggagg tagggatgtg gccaggcagg 69840 

ctgccaaaaa ttggtagagc aaggccacag gatctttctg accttccttc caaacagagg 69900 
ctcctgtact ggtgatccct gtgttgattg accactccct tcctgggggt cgtggtctct 69960

gtcccagttg cccggacttc tgtgagtgtc ctactgaggt ccttttcatg agaagcatgc 70020 

tgtccttcca cctgctggga gcaagagtga caacttcaat actataatag cagtggcata 70080 

cagagaagaa gaaagatgaa gtggcaagaa aaacaggctt ccaagcagga gtttttctat 70140 

aaaaacaaaa acgtttacaa gcaaactttt tataaagggc tagatagtaa atattttagg 70200 

ctttgagagc cacatagact tgtttgcagg gactcaatgt cgctattgta gtttgaaagc 70260 

agccatcagg gttatgtaaa tgagtgagtc tgattttgtt tcagcaaaat tttatttacc 70320 

aaaacagaca atgagtgggc tggatttggc ccatgatcct tagtttgcca actcctgctt 70380 

tgggctcacc cagatctgat tttgaattct ggctctgcta ctggttagct gcaggagctt 70440 

ggaaggctct ctgagcctgt ttcctcatct gtaaaattaa agcaataatt tctaacactc 70500 

aagagtgtta cctcacgcct gtaatcccag cactttggag gctgaggcag gcggatcacc 70560 

tgaggtcaga agttcaagac cagcgtggcc aacgtggcaa aaccctgtct ctactaaaaa 70620

atacaaaaag tagccgggca tggtggcgcg catctgtaat cccagctact tgggaggctg 70680 

aggcagggat actgctagaa cctgggaggt ggagcgtgca gtgagtggag atcacacctc 70740

cacactccag cctggccgac agagcgagac tccatctcaa aaaaaaaaaa aaaaagagtg 70800

ttagaaggtt ttgagataat gaataaaaga tgccttgtgt atactaagta ttcaacaact 70860 

gatagctgca ttggtctaat tataacagtt tagaagcgat tgagtcaaca aatgctggat 70920 

ttgtcaggga ggacttccta tcaggaggta gatcttgggc tgagtcctga agcaaagata 70980 

ggcattggat agaggagttg agagaacacc ctaggactgt tattattatt attcgacacg 71040 

gagtctcttg ctctgtcacc caggctggag tgcagtggcg cgatctcggc tcactgcaac 71100 

ctctgcctcc caggttcaag cgattctcct gcctcctaag tagctgagac tacaggtgtg 71160 

tgccaccaca cccggctaat ttttatattt ttagtagaga cagagtttca ccatgttggc 71220 

catgctggtc tcgaactcct gacttcaggt gatccacccg cctcagcctc ccaaagtgct 71280 

ggaataacag atgtgagcca ccgcacccag cccagaacca tttttcaatc cttggctctg 71340 

ccttttatta gctgcaagat ctcaggcaat ttatttaacc tctccaaaga ctcattttct 71400 

cattcacaaa atgaggcaaa taataatatc tactatccca ggttgtcatg agaattaaat 71460 

gcaacatgac atttaatgaa atgagaagtc ccttggacat taactggcta aagtatgtgc 71520

tcgacaagga tatcatttta ggtggatact tagcatctca gaactgatgc tcacaatgga 71580 

atatcattga aacgcattaa aattcatttt aaatgattgt aggtagtgag gcaattgaaa 71640

gaagaagaca agaggactga ttataatgct tcaggctcac tagtctcctt ttaggaggga 71700 

aaaacaattt caagttaaat tttaggctct agatttttac ccctgctgct cattagaatc 71760 

acccagattg atgaaatcag agcccatctg aggctgtgtt tttcatctcc agaatgagag 71820 

ctgttgtggg gattaagttt ttgaaaaagt acatctaaca ggtgatcgaa aatgatagtg 71880 

atattattgc agtgatggtc attattgttg ttattattat actgaaagag gcttcagttt 71940 

tctgatccat aaagtgaggg aattgcatga gaccattgct aagattcctt ctagctctgt 72000 

ttttttgttt ttgtttttta gacagagtct ctgtcgccca ggctggagtg caatggcatg 72060 

atcttggctc actgcaacct ccgcctcccg ggttcaaatg atcctcctgt ctcagcctcc 72120 

gaagtagctg ggactacagg cacacaccac catgcccagc taacttttat atttttaata 72180 

gaggtggggt ttcaccatat tggtcaggct ggtctcaaac tcctgacctc aggtgatcca 72240

cccgcctcgg cctcccaaca tgctgggatt acaggcatga gccactgtgc ccaacccctt 72300 

ctagctttct tgatcactga ttctagggtt ctctgctgaa atatatttga gacatcctgg 72360 

ataaaagatc atgcaagagc tcccaatatg gtattaataa ttgattctgg aggcttagct 72420 

actcctgatg gattagacat gactcaactg cctctcttat gtgtacaaca caacaacaca 72480 

accaagaaag gttattctgg cattccattt attcagttta tttacagccc ttacttccag 72540

cagcacgtta aagatatggc cagggccggg tgcagtggct caagtctgta atcccaggac 72600 

tctgggaggc caaggtgggc ggatcacaag gtcaggagtt tgagaatctg gcaattcttc 72660 

agacttagaa gcaaccagct cgataacaca gtcttgtgtg ggctctccct ctgtccctcc 72720 

ctcgcttccc tcatttctca tccctgcccc tgagactgtg caccttcaca tagccctgcc 72780 

atgagacctt catctcaggc tttgctttct ggggtaactg aggctaaaca ctgagtggcc 72840

ctaaaagagg attgggattt ggaagttaga ttattcacca gagaacagac tttgctgatg 72900 

atcaggccca ggttgtaatt gttgaaaaaa agagaggatg catagtctta tctcatctcc 72960 

tagtcaaagt caacaccatg ataaataaga gtcaaatcct gagatgtgaa ttggggacat 73020 

ttgagtggtt aaccctgaga agcttgcacc ttcagacccc tcaatacccc tgctccccag 73080 

agaaggctgg acattgacct cagcacaggc aggagccctg caagatgcca tttgtcctac 73140

taaagatgga cccctccact ctgtttctag gtaaataacc aaagtcaagt ctccacacag 73200 

cctgagcaag aaagtcagag cctgctacag gagaaaatac cacactggcc aaaggattca 73260 

ctagccctgg ccactgtgtg tgggaggaac cagggaatca tgtgtgggag tcaatgttga 73320 

agctgttgga ctgggggtgg ggtggaatat aagcctggcc ctggggagtt tttcccgttt 73380 

gagggccttt acccacaact caagatccag tgctatagca ggagatccca gagctagtcc 73440 

taacagatgg tcaggattga acttggccta gagtaaaatg aggaggatag tgccagaact 73500 
ttctcaacat actattgagg aagaggtcag aaggcttaag gaggtagtgt aactggaaag 73560

gggtcctgat ccagacccca ggagagggtt cttggacctt gcataagaaa gagttcgaga 73620 

cgagtccacc cagtaaagtg aaagcaattt tattaaagaa gaaacagaaa aatggctact 73680

ccatagagca gcgacatggg ctgcttaact gagtgttctt atgattattt cttgattcta 73740

tgctaaacaa agggtggatt atttgtgagg tttccaggaa aggggcaggg atttcccaga 73800

actgatggat ccccccactt ttagaccata tagagtaact tcctgacgtt gccatggcgt 73860

ttgtaaactg tcatggccct ggagggaatg tcttttagca tgttaatgta ttataatgtg 73920

tataatgagc agtgaggacg gccagaggtc gctttcatca ccatcttggt tttggtgggt 73980

tttggccggc ttctttatca catcctgttt tatgagcagg gtctttatga cctataactt 74040

ctcctgccga cctcctatct cctcctgtga ctaagaatgc agcctagcag gtctcagcct 74100 

cattttacca tggagtcgct ctgattccaa tgcctctgac agcaggaatg ttggaattga 74160 

attactatgc aagacctgag aagccattgg aggacacagc cttcattagg acactggcat 74220 

ctgtgacagg ctgggtggtg gtaattgtct gttggccagt gtggactgtg ggagatgcta 74280 

ctactgtaag atatgacaag gtttctcttc aaacaggctg atccgcttct tattctctaa 74340 

ttccaagtac caccccccgc ctttcttctc cttttccttc tttctgattt tactacatgc 74400

ccaggcatgc tacggcccca gctcacattc ctttccttat ttaaaaatgg actggggctg 74460 

ggcgcggtgg ctcatgcctg taatcccagc actttgggag gccgaggcgg gcggatcatg 74520 

aggtcaggag atcgagacca tcctggctaa cacggtgaaa ccccgtctct actaaaaatg 74580 

caaaaacatt agccaggcgt ggttgcaggt gcctgcagtc ccagcggctc aggaggctga 74640

ggcaggagaa tggcgtgaac ctgggaggtg gaggttgcaa tgagccgaga ttgtgccact 74700 

gcactccagc ctgggtgaca gagcgagact ccgtctcaaa aaaaaaaaaa aaaaaaaaaa 74760 

tagctgggca tggtggcgcg tgcctgtaat accagctact ctggaggctg aggcaagaga 74820

atcgcttgaa cccagtaggc ggaagttgca gtgagccgag atcttgacac tgcactccag 74880 

cctggtgaca gagtgagact ctgtctcaaa aaaaaaaaaa agaaaaaaaa agacagaaag 74940 

aaagagcaca gacagagtca caggtatttg cagtaggaag ctgtcaggtt agagtgcacg 75000 

gaaatagaaa gtatatttta cacttacagc acatcttcgt ttgattagcc acatttaaaa 75060 

tactgaatag caacgtgtgg ctatttagta ttcactaaaa tcttggacag tgcaagtcta 75120 

aagaatcctt gatccgtccg gcatggtggc tcacgccttt aatcccagca ctttgggagg 75180 

ccaaggtgga aggatcactt aaggtcagga gttcgagacc agcctggcca acatggtgaa 75240 

acctcgtctc tactaataat acaaaaaaaa ttagccgggc atggtggtgc atgcctgtaa 75300 

tcccaggtac ttgggaggct gaggcaggag aatagcttga atccaggagg cgctgcagtg 75360 

agccgagatc atgccatgcc actactgcac tccagcctgg gcaacagagt gagactgtct 75420 

caaaaaaaaa aaaaaaattg ttgggcgtgg tggctcacgc ctgtaatccc agcactttgg 75480 

gaggctgagg ggggtggatc acctgggttc tggagttcga gaccagcctg gccaacatgg 75540 

tgaaacccca tctctactaa aaatacaaaa attagctggg cgtggtggtg ggcacctgaa 75600 

atctcagcta ctcaggaggc tgaggcagga gaatttcttg aacccaggag gcagaggttg 75660 

cagtgagcca agatcgcgcc tctgcactcc atcctgggtg gcagagcaag actatgtctc 75720 

aaaaaaaaaa aaaaaaatac ttgattgtct ggacattctg cagaacatca tatggagaca 75780 

ctatgttgac gacatcatgc tgattgtaag caagaaatgg caagtgttcc agaaacacag 75840

tcaagacaca tacatgccag aaggtgagat ataaactcta ctaagattca gtggcctgcc 75900 

acactggtga catttttaaa cctgctagat gtttgtgtag aaaaggattt aaccttgccc 75960 

aaagaggggt ctggcctttg tccccagcta ctggacataa tctctttaaa ctcttgaaat 76020 

atcattcctg atagaagtat ttttgttttg actaggggcc ttgggccagc cagatagcaa 76080 

caatgtgatc tgggttgggg gctttggatc aggtggcatc agtgtgacct cctgagtggc 76140 

tagagactag aatcaaccac atgggcagac aacccagctt acatgatgga attccaataa 76200 

agactttgga cacaagggct tgggtaagct ttcctggttg gcaatgctct atactgggaa 76260 

acccattctg actccacagg gagaggacaa ctggatattc tcatttggta cctccctggg 76320 

ctttgcccta tgcatttttc ccttgtctga ttattattat tattatgaga tggaatctcg 76380 

ctctgtcacc caggctggag tgcagtggaa tgatctcaac tcactgcaac ctctgcctcc 76440 

ccggttcaag cgattttcct gtctcggcct cccgagtagc tgggactaca gatgcatacc 76500 

accacacccg gctaattttt ttgtattttt agtagagacg gggtttcacg ttagccagga 76560 

tggtctcgat ctcctgacct catgttccgc ctgcctcggc ctctcaaagt gctaggaata 76620 

catgtgtgag ccaccgcgcc cagccccctt ggctgattat taaagtgtat ccttgagctg 76680 

tagtaaatta taaccgtgaa tataacagct tttagtgagt tttgtgagca cttctagcaa 76740 

attatcaaac ctaaggatag ccttggggac ccctgaactt gcagttggtg tcagaaataa 76800 

gggtgctcat gtgtgtacca tgccctctaa ttttgtagtt aattaacttt cacaacttta 76860 

ttattaccgc ttacactcaa tgtttattca catttatcca cataccactt attctagtgc 76920 

cttgcatcaa agactttcta tctcatgtac tttattctgc ttgaagtaaa tcctttagga 76980 

tattcttttt tttttttaaa ctttgcacat acatactttt attttttatt tatttttaat 77040 

tttgttattt ttgtgggtac gtagtagata tatgtattta tggagtacat gagatgtttt 77100 
gatacaggca tgcaatgtga aataagcaca tcatggagaa tggggtatcc atcctctcaa 77160

gcaatttatc cttcaagtta caaacaatcc aattacactc tttaagttat tttaaaatgt 77220 

acatttaatt ttgtattgac tagagtcact ctgttgtgct atcaaatata attttttttt 77280 

tttttgagac agagtctcac tcagtggccc agactgaaag tgcagtggca caagctcggc 77340

tcacttcaat ctctgcctcc ctggttcaag cgaatctcct gcctcagcct cccacatagc 77400 

tgggattaca ggcacacacc accatgccca gctaattttt atattttttt agtagagacg 77460 

ggttttcgcc atgttggcca ggctggtctt gaactcctgg cctcaaatga tctgaccacc 77520 

tcagcctccc aaagtgctag gattacaggc atgagccacc acacctggcc aaaatagaat 77580 

attctttagt gaggtctgct ggtgacaatt tttttctttt ttttgagact gagtctcgct 77640 

gttgtcagct tgggctggag tgcaatagca cgatctcagc tcactgcaac ctccacctcc 77700 

cggattccag caattctcct gcctcagcct cccaagtagc tgagagatta caggcaccca 77760

ccaccacacg cggctaattt ttgtattttt agtagaaatg ggggttcacc gtgttggcca 77820

ggctggtctc gaactcctga cctcaggtga tccacccacc ttggcctccc aaagtgctgg 77880

gattacaagc atgagccacc acgcacagcc aattttttcc gtttttgtct gaaatcttat 77940 

ttcgcgtcat ctttgaaata tatttttgat ggatataaaa ttgttggttg atagttatta 78000 

tcattattat tattattttg agacagggtc tcactctgtt gcctatgctg gggtgtagta 78060 

atgtgatctc ggttcactgc agacttgacc tcctagggct caggtgatct tcccacctca 78120 

gcctccctag tagctgggac tacagatgca tgccaccata cccaactaat ttttctattt 78180 

tttgtagaga tgaggctttg ccacatttcc caggctggtc tctaactcct gagctctagc 78240 

aatccaccca ccttggcctt acaaagtgct gggccatgac tagccagcag ttacttttta 78300 

tagcatattg aatatttaat atgaatcttc tggcatccac tgtaactgtt taaaaaatca 78360 

gctgtttact tggcactctt tttttttttt ttttttttga gacagagtct tgccctgtcg 78420 

cccaggctgg agtgcagtgg cgtgatcttg gctcactgca agctctgcct cccgggttca 78480

cgccattctc ctgcctcagc ctccggagta gctgggacta aaggcgcccg ccaccacgcc 78540 

cggctgattt ttttgtattt ttcgtagagt tggggtttca ccgtgttagc caggatggtc 78600 

tcgatctcct gacctcgtga tctgtccgcc tcggcctccc aaagtgctgg gattataggc 78660 

gtgagccacc gcgcccagcc tctttttttt ttttttttag acggagtctt actctgtcat 78720 

ctaggctggt gtacagtggc gtgatctcag ctcagtgcaa cctccacctc ctgcctcagc 78780

ctgccaaata gctgggatta caggtgcgta ccatcacgcc cggctaattt ttgtattttc 78840

agtagagatg gggtttcacc atgttagaca ggctggtctc gaactcctgg cctcaagtga 78900 

tctgcctgcc ccagcctccc aaagattaca ggcatgagcc accgcacccg gccaagtagc 78960 

actcctttga aggtaatctg cttcccctac ccctagcaat ttttaacaat ttttcttcat 79020 

ttttatttcc tgaagttttg ttattaataa tctgtgtgca gatttctttg tatttctttt 79080 

gtttgcagtt catagtgatt cttgaattag tgtgttggtt tctgttatca ccacaggaaa 79140 

attgtcagcc gttagctttt caaatatttc cttgctaaat tctctcttct cccctttcgg 79200 

tacaattgat ttgattaaaa ctaaaaccag ggccgggtgc agtgactcat gcctgtaatc 79260 

ccaacacttt gagaggctga ggcaggtgga tcacctaagc tcaggagttc aagaccagcc 79320

tggccaatat ggtgaaaccc cgtctctact aaaaatacaa aaattaccag gcatggtggc 79380

acacatttgt agtcaggagg ctgaggcagg agaattgctt gaatccagga ggtggaggtt 79440 

gcagtgagct gagatcccac cactgcagtc tggcctgggc gacagagtga gatgagaatc 79500 

tgtctcgaaa aaaaaagtta tgaatgtttg ataaactata tttgttagaa tgtttgttgt 79560 

agaatactat tcattgattt ttaaacaatg ttagattaaa ccattcactg gatttgtgat 79620

aattaactta ctgattttac ctcactgatt tgttgtaatt aatacaactg gtataaaaag 79680 

actgtgacga ggccgggcat ggtggctccc gcctataatc ccagcacttt gggaggctga 79740 

ggcaggcgga tcacctgagg tcaggagttc aagaccagcc tgaccaacat ggtgaaaccc 79800 

catctttact aaaaatacaa aattagccgg tcgtggtggt gcatgcctgt aatcccagct 79860 

cttcgggagg ctgtggcagg agaatcactt gaacccggga ggtggaggtt gcagtgagcc 79920 

gatatcgcgc cattgcactc cagcctgggc aacaagagcg aaactccgtc taaaaaaaaa 79980 

aaagaaaaaa aacacataaa acaaaacaac actgtgacgg ttcccaaaaa ttaggagcat 80040 

aattaaagga actcctgata aaaattaatt ttatcttaca tgtaaactaa aatgacttta 80100 

tgaagttaat tcagaaatac aatgcagggt attagtttgc cacagctgcg tattcagcct 80160 

aatgtaatat tcttgttatt tttaaattct tcttttaact ttactcatat gtggatcatc 80220 

aaatttcaaa agattaaatg acaatactct tagcagcaag cttccctaag catataaaca 80280 

ttttaatggg tgatgattca gaaggtaccc gaagaatatg tactgccaga tatcattcac 80340

ccccatatac ctgcccgaca gacatcccat tttgggaccc tggataaatg tgtgggtgga 80400 

gagaaagata ggagaaagtg gtataagcaa atggctttgg agtctgattg acagcgattg 80460 

aaatcctgtc tctacctctt aacagcctca tgatcctaca taagttaccc cgatcctcag 80520 

ggccacatct gtaaattggg ggttgcgatg gcagccatct cacagggtct cttttcgggg 80580 

aagggcagga attatggatt aagtgagcta gtaattgtaa agcacttaat acaaggaggg 80640 

cgcataataa gtacttcata aataatgacg gccattatca tgactgaggt gtatgcagct 80700 
gtcggggatt acggcgactt cagaatttct ggtgggcagg gctcaaaggc agcaaatcac 80760

actggaagtc gaggtgaggc actgcttctg cacagactgc ttagctggag agaatgagga 80820

aggcttagag gagatttaga ggaacttaga gtcctccgcc tccaactctg tgggatctgc 80880

tcccgtgcca gagacattca ggggatttct cgcactctcc cctcccctac gtccctcccg 80940 

ccccatccaa ctaaccacac aacacataca aaatagcccc tgcgaggttc tgcacgctgg 81000 

aagggaacag gagaagggcg ctgcgctttc ttgctgatgc cctgtacttg ggcccctggt 81060 

agacacagcc acttgtcccc tcagcctgca gagaaatccc acgtagaccg cgcccgggtc 81120 

cttggcttca gccaatctcc ctttggtggg ggtgggatgc acgatccaag gttttattgg 81180 

ctacagacag cggggtgtgg tccgccaaga acacagattg gctcccgagg gcatctcgga 81240

tccctggtgg ggcgccgctc agcctcccgg tgcaggcccg gccgaggcca ggaggaagcg 81300 

gccagaccgc gtccattcgg cgccagctca ctccggacgt ccggagcctc tgccagcgct 81360 

gcttccgtcc agtgcgcctg gacgcgctgt ccttaactgg agaaaggctt caccttgaaa 81420 

tccaggcttc atccctagtt agcgtgtgac cttgagcagt tgactttatt tttcagtgcc 81480

tagttttcca gataccagga ctgactccaa ggactattac tcatctggag ggtttagcac 81540

agtaccgtcg catagtaaat ttccatgtca gttttggtta cctttcatgc acttgcaaac 81600 

atgccatgct ctgaaacgaa ataggcacat cttttttttt ttttttttta aggagtcttc 81660 

ctctcgccca ggctggagtg cagtggcgcg atcttggctc actgcaacct ccacctcccg 81720 

tgttcgagat tctcctgcct cagcctcctg attagctggg actacaggca tgccacgacg 81780 

cccagttaat ttttgtattt ttagtagaga cggggtttcg ccatcttggc caggctggtc 81840 

taactcctga cctcaggtga tctgactgcc tcagcctctc aaagtgttgg gattacaggc 81900 

ataagccact gcatctggcc agaaatgaaa taagtaaatc ttttaacctg ctctaacaat 81960 

atagtgaaaa gaccatatta ttattagagc aggttaaggg atttgcctat ttcgggttct 82020 

agttatagtc ttaaacttgg acattcttgt agaaagtaaa aagtttcctc ttcaaagttc 82080 

cccttcttgt taaagaatac atcataagtg ttagaagtaa tagtttattt taaagactaa 82140 

ctttcttcaa gcctccttgc tttgtgctaa taactctttg ttaagcccta tcctatgtaa 82200 

ctgttggaca tgctcacagg cacgttccag ttcacagcct atgccccttc cttatttgga 82260 

aatgttattg cttccttaaa cctttcggta agcaacttcc tctccttctt cgttcttcct 82320 

tgcacttacc tatttagaaa gttttaggct attagcaaat cggctatcag tttaagagtg 82380 

tgaggtcccg ctccagccaa tggatgcagg acatagcagt gaggacgacc caaatgcgta 82440 

agggataaat atgtttgctt ttcctttgtt caggtgtgct ctcgacatcg ttccatctgc 82500 

gattgagcac cctttctgca gaaagtaaag attgccttgc tggagatctt ttgtctccgt 82560 

gctgactttt cttcgtggca ccgattatct atttctaaca attttggtat ttctaacatt 82620 

ctgaacaatc ttgggctagt tgtctcttct gggcctgttt ccccatccgt cacatgataa 82680 

acttcattgg tttaaaaacc ccagcgaaca tttattgagt tactattacc ttcctgccct 82740

ccccaacccc aaccccaggg agcagttaca acctcagccg ctgagcgcac tcgccgggtg 82800 

ttaagaagca ccaaagacag ggaggcttga ttgattttgc tttgggagta gagggtcaga 82860 

agattcacag gaaaatggca tttgagcaag gatgattcac tggagctagc ttttaaatac 82920 

tggcgaggct tttatgttgc agtcccttac aaagttgagc attcgcaggg actgcactcc 82980 

gaaataagcc cgcttcccct tttcattcgc taatgatcca gggagctgct ggttccgcat 83040

gcggcaggtt gtgccttttc ctaatcaggg ttctgcatcg cctcgaaccc gcaggccgtg 83100 

gcgggttctc ctgaggaagc agggactggg gtgcagggtg aagctgctcg tgccggccag 83160 

cgcctgtgag caaaactcaa acggaggagc aggaggggtc gagctggagc gtggcagggt 83220 

tgaccctgcc ttttagaagg gcacaatttg aagggtaccc aggggccgga agccggggac 83280 

ctaaggcccg ccccgttcca gctgctggga gggctcccgc cccagggagt tagttttgca 83340 

gagactgggt ctgcagcgct ccaccggggg ccggcgacag acgccacaaa acagctgcag 83400 

gaacggtggc tcgctccagg cacccagggc ccgggaaaga ggcgcgggta gcacgcgcgg 83460 

gtcacgtggg cgatgcgggc gtgcgcccct gcacccgcgg gagggggatg gggaaaaggg 83520 

gcggggccgg cgcttgacct cccgtgaagc ctagcgcggg gaaggaccgg aactccgggc 83580 

gggcggcttg ttgataatat ggcggctgga gctgcctggg catcccgagg aggcggtggg 83640 

gcccactccc ggaagaaggg tcccttttcg cgctagtgca gcggcccctc tggacccgga 83700 

agtccgggcc ggttgctgaa tgaggggagc cgggccctcc ccgcgccagt ccccccgcac 83760 

cctccgtccc gacccgggcc ccgccatgtc cttcttccgg cggaaaggta gctgaggggg 83820 

cgccggcggg gagtcaggcc gggcctcagg ggcggcggtg gggcaggtgg gcctgcgagg 83880 

gctttcccca aggcggcagc aaggccttca gcgagcctcg acctcggcgc agatgccccc 83940 

tgagtgcctt gctctgctcc gggactcttc tgggagggag aaggtggcct tcttgcgcga 84000

ggtcagagga gtattgtcgc gctggttcag aagcgattgc taaagcccat agaagttcct 84060 

gcctgtttgg ttaagaacag ttcttaggtg ggggttagtt tttttgtgtt tctttgagga 84120 

ccgtggatca agatcaagga aatctcttta gaaccttatt atggaagtct gaagtttcca 84180 

aatgttgagg gttttatgtc taaaagcaac acgtgaaaaa attgttttct tcacccagtg 84240 

ctgtcttcca atttcctctt tggggggagg ggtagttact gctgttacta aaataaaatt 84300 
acttattgct aaagttcccc aacaggaaga ccactacttt tgatgacttt ggcaagtttg 84360

ctaactactg gaaccctaac ttacaaacga actacttaca tttttgattt ccagttgtat 84420 

tacctgccca atgtttacgt agaaacagct taattttgat tctgggtaac gttgttgcac 84480 

ttcattaaaa atacatatcc gaagtgagca agtatgggtc tgtggacagc agtgattttt 84540

cctgtcaatt cctgttgctt cagataaaat gtaccagaca gaggccgggc gcggtggctc 84600

acgcctgtaa tcccagcact ttgggaggct tggcgggtgg atcacctgag atcgggagtt 84660

caagaccagc ctgaccaaca tggagaaacc ccgtgtctac taaaaataca aaattagcca 84720 

gggtggtggc gcatgcctgt aatgccagct acttgggagg ctgaagcagg agaatcgctt 84780 

gaacctggga ggcggaggtt gcggtgagcc gagatagcac cattgcactc cagcctgggc 84840

aaaaagagcg aaactccgtc tcaaaaaaaa agtaccagac agaaatgggt tttgttttct 84900

ttttttgttt tgagacggag tttcgctctt gttgcccagg ctcgagtgca atggcgcgat 84960

ctcagtctcg gctcactgca acctctgtct cccaggttta atcgattctc ctgcctcagc 85020

ctcccaagta gctgggatta cccatgcccc accatgcccg gctaattttt gtatttttag 85080 

tagaaacggg gcttcaccat gttaggctgg tcttgaaccc ctgacctcaa gtgggcctcc 85140

cacctcggcc tcccaaagtg ccaggattac aggcatgagc caccgcggcc agccagaaat 85200 

gggttttgga aaaagcacta aacaaaatcg aacttggttt catatgacag ctctgctgct 85260 

aactgtaaca ggggcagacc agttaaccta cttttctgtc ttctgtcagc tgagaattag 85320 

atgattccca aaggcccatt gaactctgaa tgactttaaa tacttcttct taagtgggta 85380 

cacggttttg gtaactgatg ccaggtgatg aatgcatgaa agtgcttaat gaatgaaacc 85440 

ggtaaaatag taggaggaag ctttattggt aaggcagggg tatacctaat agctctctaa 85500 

tttattggta ttgaagtggt taacttttgt ttttttaagg ggggaaaaca ttctaagaat 85560 

aatgaggcaa actgcatatt gcacaagaga ctgttgtctc tattcaacaa ataccttttg 85620 

agtgtccaga gtctgccagg tgctgtgcta ggccctcacg attgagtagt gaaccagaga 85680 

atgtccctgc acccatggag cttattgtct actggggtag acagataata aataagcaaa 85740

caaatcttct ctcttctccc tttcgctcca tgtaagtgtg tgtgtatagg tgtatactta 85800 

caagttgagt aaagtgttat gaaagattaa gaggagaaat gcattttggt tagatgttag 85860 

aggactcagc aggtgacctt gaaacttaga gctgaaggat cagtaggagg taactagaga 85920 

ggccagggaa tcgcatgttc aaaggccagg aggcaagaaa gagcatggtg cccttcaaga 85980 

gaggaaagaa ggctactgtg actggagcat agatgtaggc aagtgttggg tgattgagag 86040

ctctacgggc catggttagg ttttattcct aatgccgaga tgccaaacat ggtggttcat 86100 

atctgtaatc ccagtatttt aggaggccga ggcaggaata tagcttgaac ccaggagttc 86160 

aagaccagcc tgagcaacat gagacctgta caaaacattt aaaaaattgc tgggtatgat 86220 

ggtgcacacc tgtggtccca gctactcagg aggctgaggc agaaggatca cttgagccta 86280 

ggaggtggag gctacaatga gccatatttg agtcactaca ctccagcctg gatgacaaag 86340 

tgagaccatg tgtcaaacaa aatacagaaa gaatattaat ttaaaatttt gaaagaggag 86400 

tgatctgaac ttatatctta aaaagatcat tctagggcat ggtggctcat gcctgtaatc 86460 

aagggctttg ggaggctgag acaggaggat cacctgaggc cagttcgaga tcaacctgta 86520 

cagcatagag agactccatc tctacaaaaa gaaaaaataa atagctgggt gttgtgagtt 86580 

attcaggagg ctgaagcaga aagatcactt gagcccagga gtttgaggct gcagtaagct 86640

atgatcccac cactgcaaca cagtgagatc ttgtctcaaa aaaaaaaaaa aatcattcta 86700 

ggtgcttttt ggaggctgga tgtggtaaga gtagaagctg gagatggtcc tgttagggat 86760 

tcgattcaga ctttaaatac catcaatgca ttgagtccca aatttacatc actacgttgg 86820 

atccttgccc ctgaatccag actggtatat ccaactttag gttcagtttg tatctctacc 86880 

tgaccaatat agaggtgtcc agtcttttgg cttccctagg ccacattgga agaagaattg 86940 

tcttgagcca cacatagagt acactaacgc taacaatagc agatgagcta aaaaaaaatc 87000 

gcaaaactta taatgtttta agaaagttta cgaatttgtg ttgggcacat tcagagccat 87060 

cctgggccgc gggatggaca agcttaatcc agtagatacc ttcaacttac aatatctaaa 87120 

attttatgcc agatttagtc attttaaacc tgctcatcag tttttctcaa gaagtagtat 87180 

tttggctttt tttcttttct tttttttgag atggagtttc gctcttatcg ttcaagctgg 87240 

agtgcagtgg cggatcttgg ctcactgcaa cctccgcctc ctgggttcaa gtgattctcc 87300 

tgcctcagcc tcgcaagtag ctggaattac aggcatgcgc caccatgacc agctaatttt 87360 

tggagacagg gtttcaccat gttggtcagg ctggttttgt actcctgacc tcaggtgatc 87420 

tgcctgcctc ggcctcccaa aggctgggat tacaggcatg agccaccgct cccggctgca 87480

tttttggatt tttagttgct cagcccaaaa ctttagtaca tctttgaacc tcttctttcc 87540 

tcctactcta tatctgatcc atcagcaaat ctgttaggtc tacctcacac atatcgaaat 87600 

cctaccacgt ctcaccatct gtgacaatta acaccctggt ctaggcagtc atctctgtta 87660 

agattgagtg gttaaggatg tcctctaagg agatgacatt caaatcttag cttaaatgtc 87720 

aagagggagc tggttttata aagattgagg aggcagcatt attttgccat aggcttccat 87780 

ttggtttcca ttccattctt gatacttatg gtatatattc aaaacaaatg cacagaaaca 87840 

gacccaggta tattgggaat ttcggatata gagttcctag ttgggaaaag atagactgat 87900 
ctgtaaatga tgctagttat ccatcatctg gcaaaaaata atttcctgcc tcctctcata 87960

tatctcagat caacagactt tttctgttaa gggccaaatc ataaatattt taggctttcc 88020

agaccatatg gtttctgtca cactctcctt tatccttgaa gccatagaca atatgtaaac 88080 

aaatgggcat ggctgtgcta cgataaaact ttacttacaa aaactggtag tgggccagtt 88140 

taggcatggc cagcactttg ggaggctaag gcagatggat cacttggggt caggagtttg 88200 

agaccagcct ggccaacatg gtgaaaccct gtctctacta aaaatacaaa aaatagctgg 88260 

gcatggtggt gggtgtctat aattccagct actctggagg ctaagacaca agaatcactt 88320 

gaacccagga ggcagaggtt gcagtgagct gagatagcac cactgcactc cagccagggt 88380

gacggagtct taaagcaaaa caaaacaaaa ggtagtgggt tgtatttggc ccatgggctg 88440 

tagtttgcca atccctgatg cagaaacaaa ttccaggtaa ataagagcct ggaatgttaa 88500 

aaaaacaaaa cttgaagtca tgtagaagaa caggtagggg gaacaatcct gatctcagga 88560 

taggaaggga tattgcttaa aataagacac aggaaaatat aatccatgtt gtgtaaattt 88620 

gactacgtta aaacttaaaa ctttcgccaa gcgcggtggc tcacgcctgt aataccagta 88680 

ctttgggagg ccgaggtgag cagatcacca ggtcaggaga ttgagaccat cctggctaac 88740 

acggtgaaac cccgtctcta ctaaaaatac aaaacattag ccgggcgtgg tggcgggcgc 88800 

ctgtagtccc agctacttgg gaggctgagg caggagaatg gcctgaaccc gggaggcgaa 88860 

gcttgcagtg agctgagatc gcgccactgc actccagcct gggcgacaga gtgagattcc 88920 

gtctcaaaaa aacaaaacaa aacaaagcaa aaaacctaaa actttcatac aataaagtat 88980 

acctaagata cttctagaag agaagattta catccaggac gtgtatggaa tttctgcaag 89040 

taataagtaa aagacaaggg acatgaagag gcagttcaca aaagaggaag ccaaaatgac 89100 

caataaacat gaaaggatgt ttaacctcaa aggaaacaag gaaatgaatt aaaaacatca 89160 

aatgccattt caaaactagt aagttggcaa aattaaaaat accaaggatg agaatatgaa 89220 

gcatggctat atgagtgcat ggaatggtac agtcactttc attaaaaatg cacataattt 89280 

gttttttatt tatttttttg agacagtcta tgtcgcccag gctagaatgc agtggcatga 89340

tctcggctca ccacaatctc tgcctcctgg gttcaagcaa ttctcctgcc tcagcctcct 89400 

gagtagctgg gattacaggc acatgccaca acgcccggtt aagttttgta tttttagtag 89460 

agacagggtt ttgccatgtt ggccaggctg gtctcgaact cctgacctca ggtgagctgc 89520 

ttcccaaagt gctgggatta gaggcgtgag ccaatgctcc tggctgaaaa aaatgcacat 89580 

aatttgttac ctagcaattc catgtctaga ggcttatcct agagaaattc ttgcttatat 89640 

gcataggaag acgtgtacta gaatgttcac tagttgaatg tttaagtgaa aattaggaaa 89700 

taaagtaaat gttcattaac aggaaaatga gtaaaggtat atttataaaa caattaagta 89760 

gctaaaatga ataaactaga gctgcgtgaa tgaactagaa ctggttcaat agtcatgtca 89820 

gattattgaa tgaatacagg tcagatatgt atagagtgtc atttgtgtaa ttaatttttt 89880 

tttttttttt gagatggagt ctcactctgt tgcccaggct ggagtgcagt ggcgtgatct 89940 

cagctcactg caacctccac ctcctgggtt aaagtgattc tcctgcctca gcctcccgag 90000 

tagttgggat tacaggcatg caccaccatg cccagctcat tttcctattt ttagtggcca 90060 

cagggtttca ccatgttggc caggctggtc ttgaactcct gacctcaagt gttccaccca 90120 

acttggcctc ccaaagtgct aggattacag gcgtgagcca ccgtgctcag ccatttgcgt 90180 

gatttttaaa gatgtgcaga ataatgccat taaaaaaaat acacatacat gtatatatat 90240

acacgtttgg ctgggtgtgg tggctcacac ctgtaatccc agcactttgg gaggctgagg 90300 

caggaggatc acttgagccc aggtgtacaa gactagcctg ggcgagatag caagacccca 90360 

tctcaacaac agaaaggata attaggtatg gtggcatgag aggatcactt gagcccagga 90420 

gttcgagtgt tatcaggcca ctgcactcta gcctggacaa caaagcaaga ccgtgtctca 90480 

aaaaaataaa aataaaaagt atttgtatgt ggtcatagtc aaaaaacgta catggaagga 90540 

aaatgtcttt atttatttat ttattttttt ttttttaaga cagagtcttg ctctgtcacc 90600 

caggctgggg tacagtggtg taatctcagc tcaccgcaat ctcggcctcc cgggttcaag 90660 

cgattcttct gcctcagcct tctaagtagc tgggactaca ggtacccgcc accacaccct 90720 

gctaattctt gtgttttcag tagagacagg gtttcaccat gttggcaagg ctggtctcga 90780 

actcctgacc ttaagtgagc cacccgcctt ggcctcccaa agtcctggga ttacaggtgt 90840

gagccactgc gcttggccag gaaatatcta atttagtaag tatttatatc tgggaaagga 90900 

agggtcaggt ggtgattcat aggaactcta aagtctatgt ataatactta gggggacaga 90960 

aggaaataaa gcaaaatgct gatatttgat tgttgagttg tgtatatgtt agaagtataa 91020 

cataggagat ctgattgata gtaggagaat gtttttaggt ggtaaaagtg gaaccgtggt 91080 

ggtttgtttt ggcagtagaa tcagttggtc atagtttgta tgtggaaggt aataaacaga 91140 

ccatgttaag gatgacttcc ggaattttgg tctgagtagt gggtggatga cagtgtcatt 91200 

catgagggaa gatgaagact gaggtaggaa caggtttggg agaagatgac atgttccctt 91260 

ttagacaagt ggaattatgg aagatggcag gtaggtggtt agctatatga atttgagata 91320

aaagatttag gatggagata taaatttagg agtaacagcg tatctatggt attgtaagcc 91380 

ttaagaatgg gtaggatcag ccaggaaata cagatgtata tgcagaagag aggagtcaag 91440 

gaagccaaga caagttaatg tttaaagtga gtgatgtagt ccatgggcag atgctgctga 91500 
gagggctgca aacaccagtg accctacaac atttttaaat gtcgtcttcc tgacagcagt 91560

gatcagtacc tgcaacgatc ttatttattt ttttcatgtt agtctccaca cacttgaatg 91620 

tagacttttt gaaggcaaaa tcattgcctt ttctgagctg ggagcatgtc tggcacatac 91680 

caagcactca acagttgatg tattgacttc atccagatac tctgagggcg agttatttcc 91740

tgctactagc ctttcacctt tcaatgttta agagcacaaa tacagagatg ggcacgtttt 91800 

ggcatttctt attttgataa ccttttcctg gtaagatttt ttaatgttga aaaaaaaaaa 91860 

caagaaaaga gggttaaaaa tagtcttatg tcagatcctg tgatagaatt cacacttggc 91920 

ttaagctgct gggcaccttc ctatcttgga tgtcatatta gcttatctac agcagaattt 91980 

ttactgtttt atgtagtaag gaagcaatta tatgattatt ttacagacaa attattcttt 92040 

atcttttatt tttttagacg gagtctctct ttgtctccca ggctggagta cagtgtcgcg 92100 

atctcggctc actgcaacct ccgcctcctg ggttcaagca attctctgcc tcagcctccc 92160 

aagtagctgg gcttacaggt gtccgccacc acacccagct cattgttttg tatttttagt 92220 

agagatgggg tttcaccatg ttggccaggc tggtcttgag ctactgacct caggtgatcc 92280 

acccgccttg gcatcccaaa gtgctggaat tacaggcgtg agccaccgtg cctggcccag 92340 

acaaattatt atactctgag tgttagaggc ttaggatgtt ttcacttgat gctatgggag 92400 

gaataagtaa taagatatga tacacaacca aagacctttc ttcactatgc ttctagtagc 92460 

tagtactatg gatgacacat ggtaataata ttggttagca tttgtcctca atttactgtg 92520 

ctagttactc ttctaagccc cttacaggta tatatttttt ttcatcaata atcctctaag 92580 

gtagtcttta ttattgacct aattttataa atcaagaaaa ttaagaccca gagaagtaag 92640 

taacttgtcc aagatcacat ggcttataag tggtagagcc agaatttgac cccagatgtt 92700 

gtgactacat tgtctctcca taagcaggtt caactctttt gactggatgc tgttccaagg 92760 

tcacttcctt agagaagcct ttgctgacaa ctaccctcct gtgccctcct ccaaggctgt 92820

ccattgttct agaactttga atactcatct tagaataaag ctggtctaat ttttacagtg 92880

ttatagaatg gatctctgac tgcaaaagtt ggtcataatt atctttttat gttctagtga 92940 

aaggcaaaga acaagagaag acctcagatg tgaagtccat taaaggtaag ttctgccctt 93000

ggcagtccac tgcattaaaa agtgatgtgc tttgcatttg tgagttcttt aatcctgtta 93060 

tactctctct tttggcatta atcatttctg ccttatttta taattactta tgattttgat 93120 

ttatttccct ctttaacctg tataatgctt taacatctag catataataa gtaggctttt 93180 

tttttttttt tttttttgga gacggagtct tgctctgtta cccaggctgg agtgcagtgg 93240 

cgcgatcttg gctcactgca agctctgtct cccgggttca caccattctc ctgcctcagc 93300 

ctccccagca gctgggacta caggtgcacg gcgccacgcc tggctaattt tttgtatttt 93360

ttagtagaga cagagtttca ccatgttagc cagtatggtc tcgatctcct gaccttgtga 93420 

tccgcccgcc tcggcctccc aaagtgctgg gattacaagc gtgagccacc gcacccggcc 93480 

gtaagtaggc tttttttacc ttaattttat ttttttgaga tggagtcttg ctcttatccc 93540 

caggctggag tgcagtggtg ccatctcggc tcactgcagc atccacctcc cgggttcaag 93600 

cgattctcct gcctcagcct cccgagtagc tgggattaca ggtggccgcc accatgccca 93660 

gctaattttt gtatttttag tagagacagg gtttcaccgt gttggccagg ccagtctcaa 93720 

actcctgacc tcaagtgatc cactcgcctt ggcctcccaa agtcctggga ttacaggcgt 93780 

gagccaccat gcctggccat aagtaggctt ttactgagcc ttgtgtgtat tggctatcct 93840

agtgattaca gtgaaccagt gcccttctta ttaatcacac atttaattgt tccctaaaag 93900

tgattagttc actttattta tttagtaaga caaaaaatga agaatactct taactgagca 93960 

gtctgttaac tgtaggaaag cactgacact tataaggctt agttttctgt catttatcca 94020 

gaagtatggt tgattacagt ttttactttt ttatttgaat gaacaacctt aatttaaaat 94080 

atattttgtt tattttttgt tgggatcgat acattgtcct tgtttataga ttagagcatg 94140 

ctttttaaag atgctgtatt actcactgat tttatttgtc cagtgtacag agattgaagt 94200 

gggaaaatta taatggaaat tgtttccata gtcattacat attaatttca tcaatttatt 94260 

tccataaaat ctgtagattg ctacttattt agatttttcc ttcaaatgtt tttatgttgt 94320 

attgcttgca ctgagtattt attctatatg ctcaatttgc tggagaagaa gactaattat 94380

aacttaggca agttgtaaaa ttagggaaaa aagtaaggta ccttacagcc tagtttactt 94440 

atttcttatg taaagccagt tagattccac attagttcaa actgccttct ttgagcaaaa 94500 

cttgattggc agtgataaag gcttaaagcc cttctcaagc agagacctgt aaagactaga 94560 

tctgactgta gtagaaggaa ggaacttaga tgtttcaggc agtgagaaca ccagtcttcc 94620 

actctaaact ttgccactaa cagtatgacc ttgggaagtt gtaactttct tcagattctt 94680 

catttgttga atggggggat tggcctagct aatttctaaa tctctactgg gctaaaaaat 94740

tctgtgctta tactctgatt atgaagtaca taatctgtgc ttaacattca ctgacttatc 94800 

cttaggataa tacagaagca gtacaagaaa cagcccctca agatgtttgc agtctggtta 94860

gaaagacaaa cttatacaca gaacagtagc aaatagacca aaataataat agctgccatt 94920

tatagaacac ttcttctgtt ctgggcatta gacaaaaact gactataacg gtgaacaaaa 94980 

aagacttagg tcctgccctc attgaactta cagattagta ggggagagga acattaatca 9504 0 

agtaattcca cagatggctt agcctagatt ggtagtgatg gaagtaaaga gatgtgaacg 95100 
gacttgaaaa aaaattcgga ggcaaaatgg atagaagttt attattgatt aaatatgagg 95160

tgtgagagag agggatattt aagattgata cctaccttct ggcttgccta acagaaccaa 95220

aacaggaaat tatatgttca gttttgttat gttgggtggg aggtgctttt gagtcattca 95280

tttatatatg ttatatatgt tattttatat gcatagtaat tttaaggtct gagttttaaa 95340

ccaaaggtta gagagtgatt ttttagagtc tagcaaacct aagttgaaat cctgcctgtt 95400

gaaatggctg tttactagct cattaaccta gggcaaagta ttcaacttgt tttcattttt 95460

gtcttcatct ctaaaatgag gaaaatatgg tcttacaaga ttgtcctgag agatagatga 95520

aataatatcc aaaaaaaaaa aaggtacata gagaaactcg tatagtgcct ggtatatagt 95580

aggtcctcca ttggtagcta tcattatcta gttttaacat agccttcagt ttgttgaatt 95640

agtcaaactg agtgaagcac tgcaaggaat tcagaggaat ttgagatcaa caaatgattt 95700

ctgaagttta gggaagactt catggcaatg acacttacct tgtataaaag ttgaagaata 95760

agaaagattt gaatgagaga ttctttctct tctccctacc agcccagctt cttatttgag 95820

gatatattgg gcaaaggggc cttcagacaa gtagagggag atttttacag aaagattgag 95880

atgaaggtat agaaggctgt aaagaccaga aaagagaatt gagacagagg aagcaggaag 95940

ccactgtagg tttttgagca agatattgat gctgtaagta tggtgtttat gaaaggttag 96000

tctggaagag atttgcagga tggagacccc ggaagttttt ttgttataat acagaaagac 96060

ttgcactgag ggtgaggtgt taaaaataaa caggtaagta aatgtttaaa catcttgaag 96120

gaaaagtcaa caaatcttgg caagtaaaca gataacagtg aaaaagaatg ggaccaagat 96180

tttgagtttt ggagactggt ggattgaaca gacagggaaa ttgagaggag aatcagatga 96240

tgatgtttta agttgatatt tagacagatt gtgcttgaga tggtaaagtc aatgtgggtg 96300

ggaatgctta gtagcgagta atcagtgata caagaccaaa gcccaggtca aagacaagtc 96360

acagatacag atcagggctt tttcatctgc tccacagagg tgtaccctag gagctgttgc 96420

aaacagccca tgtggagggt gtgagtaaga tgtttccctt gaatttgcca gaattacttt 96480

tttgttgttg ttgttgtttt ttctgagaca gattctcgct ctgttgccca ggctggaggg 96540

cagtggcgag atcgcgcagc tcactgcaac ctctgcctct cgggttcgag tgattctcct 96600

gcctcagcct cccaagtagc tgggattaca ggcttgtgcc accaagccca gctaatttct 96660

tttgtatttt tagtagagat ggggtttcac catgttggcc agactggtct cgaactcctg 96720

gcctcgtgat ctgcctgcct cagcctccaa aagttctggg attacaggcg tgaaccactg 96780

cacccggtcc cttgttaagt ttattttggt gggaagcaaa ggaggtttca gcttttaaaa 96840

agtttgaaaa ttattgctct ggtaataatt aaagatttga gagtaaatat gctttctagc 96900

agaaagaata aaagaagaac agatagcctc aagaagggga gccaaagaag caggctatat 96960

ctgacacact gggtgttgat aaatgggtat taaaagaatg agagcaatga gcagatagaa 97020

gaggaaatta ggagagtata ataccatgga gaccaagaaa gatagactat caggaaggag 97080

tggtaaaaat aagttactag ttctaagaga gatgttaaga gggaccgggg aaagccttgt 97140

acaaatgagt tagtagcatt ttacattata tacatctaat taagaaacaa tgcgagagtc 97200

tcaccattcc tatagactct tacttgtact tgtctgaaca cgaaaactgg cttttgttta 97260

taaataagct aaaaattatt ttgctccaat ttctcatgaa aataaaaata aaccttcttt 97320

taacattgaa aaaatagttt gaagacagtc actcttcatt ttgtaattcc cacaactatt 97380

attgaatgac tgaaattatc tttattctga agccaaaggg gtgatactga tatttcttca 97440

gactactaaa aatatatttt atgaattttt agtgtgcttt atcttttttt gttttttttt 97500

ttgagatgga gtttcactcc cgttgctcag gctggagggc agtggtgcaa tctcagctca 97560

ctgcaacctt cgcctcccag attcaagcaa ttctcctgcc tcggtctccc aagtagctgg 97620

gattacaggc acctgccccc acacccagct aattttttgt atttttagta gagacagggt 97680

ttcaccatgt tggtcaggct ggtcttgaac tcctgacctc aggtgatcca cccaccttgg 97740

cctcccaaag tactgcgatt gcaggcatga gccaccatgc ctggcctgag gaatattttt 97800

ctaggttccc cccaccccaa gcatttattc tgcaatttta gttttgttcc taaagcaagc 97860

aaggcttaag gatttaaaaa taatccgtat tttagaatgc tttctggctt tgttactttt 97920

tatccacagt agaagttctc agagaatgat ctccctcttt taatttaact ttttggcaca 97980

gtattttgag aattataaat aatattagaa tgttttctgg ctgggtgtgg tggctcatgc 98040

ctgtaatcct ggctacttgg gaggctgagg caggagaatc acttgaacat gggaggcaga 98100

ggttgcagtg agccgaggtc atgccactgc actccagcct gggtgacaga gcaagactct 98160

gtctgggaaa aaaaaaaaaa aaaaaaagag tgttttcttt cctattttcc accacttgat 98220

taagttactt ttcctcttaa gtattttttg ctgagtatgc tgacttaaga gtaatgttac 98280

aaaatttaat ttttaaagtt ctctgaaagc ccctttatga gagttttagg ctatcaaatt 98340

gtgtttaatt cttaacaatt ttttgaaaaa ttatagcttc aatatccgta cattccccac 98400

aaaaaagcac taaaaatcat gccttgctgg aggctgcagg accaagtcat gttgcaatca 98460

atgccatttc tgccaacatg gactcctttt caagtagcag gacagccaca cttaagaagc 98520

agccaagcca catggaggcc gctcattttg gtgacctggg taagtaacta tcatttttta 98580

ttaacttgta ttagaaggat ttgagtacaa tatgtgaaac ttctgtcata ggatacagaa 98640

ctatataatt ggaaagtgct ttggaaaaaa tgtatttaaa ataacagcta caagtataat 98700

gggtagctgt gttgtgttcc tgtaaatata gaatataaag catgcccagt agaaaaacaa 98760

gcatttccag aagaaatata tctgatcact aaatataaat atatgaaaaa gatgtctcac 98820

tttattactg agggaagtgc aaattaaaat aatcagttaa tgttctccta acacattagc 98880

atatttttta aagtttgaca atttgaatgt cagtgaagat gcagggaaat acccctccta 98940

tttagtgata atataatctg gtgaagactc tttggaaagc aatttggaaa tcagtataaa 99000

atatgcatgt catttaggcc actctttcta agacctagcc ctcagatatg ctcattcata 99060

tgtgcaggtg tgtatgtgtg tgtgtgtgtg tgtgtgtgtg tgtatatgta tgtatgtatg 99120

tatgtatgta tgtatgttga aggctattca ttatagtatt gtttgtgata gcaaaaaatt 99180

atggacaaca tataaatatc tgttataggg aaataaccaa attgtggtat acgcatgctc 99240

tggagtataa tatagccatt tgtttctatt tatttatttt cttgagacag ggttttactc 99300

tgttgcccag gctggagtgc agtggtatga tcatggttca ctgcagcctt cacctcctgg 99360

gcacaagcca ttctctcgcc tcagcctcca gagttactag gactgcaggc atgtgtcacc 99420

acacccagat aattttttaa ttttttgtag agacagggtc tcactatgtt gcctaagctg 99480

gtctcaaact cctggcctca agcaattctc ccacacaggc ctcccaaagt gctgggatta 99540

ccaacgtgaa ccaccacacc tggttcagtg tagccattta gaaatctaaa aaagacgtgg 99600

gaaaatgtct aaggcatgtt taaatgtgag aaaagcaagt cacagtatgc atggtaaaat 99660

ccgttatatt aaaataagtt cttccaaaac aaaaacatat gcaggagacc tttattttgt 99720

cagtatttct tacccaaatt tctgcactta gaaaattgca tgtcatgttg tcataagttg 99780

aaaaaaagat ccatgaacca atggacttct aataaaatca gtcctgcttt tgacatctct 99840

ctctactttt gtgtatattc aaaccagagt gtcaatgtgt ttgtggggca cacttagcaa 99900

taatacatag cagacaaaat gcatatagct cagagagtaa aattgtaagt tttgctagat 99960

cactcataaa ttgctgatga gaatttaaaa tggtgcagat gctctggaaa acaggcagtt 100020

tctttctttc tttttttttt tctttttgag acagggtctc actctgttgc gcaggctgga 100080

gtacagtggc gtgattacaa ctcactgcag cctcaccctc ctcaggttca ggtgatcctc 100140

cctcagtctc ctgagtagct gggactatag gcatgcacca ccacgcctgg ctaatttttg 100200

tatttttttt tttttttttt gtagagacgg ggtttcgcca tgtttcccag gctggtctca 100260

aactcctgga atcaagcgat ccacttgcgt aggcctccca aagtgctggg attacgggcg 100320

tgagctactg tgcctggcct aggcagtttg tttgtttgtt tgtttgtttg tttatttatt 100380

tgtagacgga gtctcacagg ctggagtgca gtggcccaat ttttggctca ctgcaacctc 100440

cgcctcccag gttcaagcta ttctcctgcc tcagcctcct gagtagctgg gatgacaggt 100500

gcctgccata atgcctggct gatttttgta tatttagtag atatggggtt tcaccatgtt 100560

ggtcaggctg gttttgaact cctgacctca ggtgatcagc ccgcctcggc ctcccaaagt 100620

gctgggatta caggcatgag ccgtcatccc tggctggtgg tttcttatga cgtgaaacat 100680

gcaattacca tatgacctag cagttgcact ctgtatttat cccagataaa tgaaaactta 100740

ccttccaata aaaacctgtg cacaaatgtt catagcagct taatattgaa aaactggatg 100800

ttcttcagca ggtgaatgaa ctggttcatt cataccatgg aataccattc agcaataaaa 100860

aggaacaaac tgttgataca tttaaccacc tggatgaata tcaagggaat tatgctgtca 100920

gacaaaaacc agtccctaaa gactacatat agtatgattc cgtttggata atattcttga 100980

aatagagaaa ttaagagaaa tgaaaagatt agtgtttgcc agatgttaga gacagggagg 101040

tgagaggggt aagtgggtgt agttataaaa gtgcaacatg agggatcttt gtgatgttga 101100

agttgtatct tggcagtgga tgcagaaatc tcaatgtgat aaaattacaa agaactaaaa 101160

acaagaatga gtatagataa aactggggaa atctgaacaa gttagagtgt tgtatcactg 101220

tcagtatctt agagtgatat tgtactatag ctttgcaaga tgttaccatg ggagaaacta 101280

aagtgtacaa gggatctcta ggtattatta tttttttaga gatggggttt cactatgttc 101340

cccaggccgg tcttgaactc ctgggctcta gtgatccgcc tgccccagcc tcctaaagta 101400

ctggaattac aggcgtgagc gaccatgcct ggccctttca gtattgtatc ttagaacttc 101460

atgtgaatct agcattatct catagaattt aattaaaaga aattgtaaac ctcacagaag 101520

atcagaattt cctcaagttt gtgatgttga caaagatgaa ctagttgaca ctgacagtaa 101580

gactgaggat gaagacacga cgtgcttcaa aaaaatgatt tgaatatcaa tggattaaga 101640

agaactcttt tgacaaattg atgaaaccct cagtcagttt tataagaatg cccatcttta 101700

tgatcatgct atgaaagcca atttttaaaa aaattttttg tctttcctaa caattagctt 101760

gtggttataa tttaaattta gttaaatata agataaatga ttttttatta agtttagttt 101820

catttttcaa ggtacgatct caaagctact ctttaaccta ctatgaatga ataatgctga 101880

gttcataaca tctttgtaga tatatccaca attttccctc aggataagtg cctacaagtg 101940

gaattactgg actgaaaata atgcagtttg ctaagacttt gctatctgtt cctgaatgct 102000

cctccaaaaa ggttttgcca gtttacatcc tcatgaccag cgaatgagag tgttgcctat 102060

tttcctgtgc ccttgttact gcttaataat ttttgaaaaa aatctaattt gacagacaaa 102120

aatgcatttt atgttaattt gcttttctgg gatttttaat gaggttgagt atagttttta 102180

atatttttat tggccccttt ggaactagta tcataagttt tttttcttaa gaatttatgt 102240

agtctgggct gggcgcagtg gctcacgcct gcaatcccag cactttggga ggccgaggtg 102300
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ggtggattgc 
tctactaaaa 
taggaggctg 
cgagatcgcg 
aaaaaaaaaa 
attgagtaac 
agaaaccaaa 
tttattcaag 
aaactatttg 
ggactaccag 
cgacactatt 
ggtgaaattt 
acacagtcta 
gcatgaaact 
ctcagcacag 
cactcagaat 
aatggccaga 
tacagtagcc 
gaaaagtgag 
tactaagcag 
tacttgtaaa 
gagcaggagt 
aatgtcctac 
ttcctccctc 
aaaattctta 
ggattttatt 
ttttgagatg 
actgcaacct 
aggcgcctgc 
catgttcgcc 
aagtgctggg 
agaaaaccag 
ctagaagcat 
atactcttaa 
aatgttcttt 
tcagtgaact 
ttcctccctc 
gcgcccgcca 
tggtcaggct 
tgctgggatt 
actgtaagct 
gagaagattg 
gctcaaacaa 
tcttattcct 
ttaaatgtat 
atttctttgt 
gtgaacatgt 
gttacttcag 
tctggtacca 
caggtttata 
ctggagtatt 
tctgaattcc 
tttctgcctc 
gagagaattc 
tgtttctgta 
caatgacgta 
ctcagccccc 
atttttagaa 
gtgatctgcc 
tttattgttt 



cgaaggtcag 
gtacaaaaac 
agtcaagaga 
ccattgctct 
aaaaaagaat 
aaataacttt 
agcatagtat 
gtctctggta 
ttgctgaact 
actcaagaga 
gtcctccctt 
tggttagagg 
aacacagtga 
acagcgtctt 
ttgtttatga 
cacttgctgc 
gcaggaactc 
agtagaaata 
tatgtgattt 
aacttcagat 
tttgggagaa 
gactggacct 
tt ttcccctc 
ggtcttaatt 
aaaattgcca 
tattagtcac 
gagtcttgct 
ccgggttcat 
caccacaccc 
aggatggtct 
attacaggca 
ggcttagaaa 
tttgacaaga 
ttatcacctg 
gtgtcttaaa 
aaaatgaggt 
cctccctcct 
ccactcctgg 
gatcttgaac 
acaggcatga 
gggagaagtg 
cttgagccca 
agaaaaaaag 
ttcacccttc 
atttgtctga 
ttcttcggat 
ttcttggact 
gtgttttgca 
cttaaaactg 
cttactgtag 
aatatgctct 
agaatactac 
ccactatttt 
agtattggga 
attgtttttt 
ctctcagctc 
tgagtagctg 
gagatggggt 
cacctcagcc 
ttagaaactg 



gagtttgaga 
tagctcagcg 
atcgcttgaa 
ccagcctagg 
ttacatggtc 
ttaataattt 
ttgtagtttt 
ccagttgttg 
gctaattctt 
ccaaatcaag 
acttcattca 
ctgaaagttt 
agcagagctc 
ttttaactga 
ctcattcaga 
tttcccagga 
accaagtttc 
gtcccgcttc 
tcttgtgtgt 
gaggaataaa 
tttggagagt 
tctaagaagt 
cactgatttt 
ttattaatat 
ataagtgaca 
aagacctttg 
ctgtcgccca 
gccattctcc 
ggctaatttt 
cgatctcctg 
tgagccaccg 
ggttaggtaa 
gcacctgttt 
ggattttgat 
gggctaagtg 
ctaatctgct 
tccttccctc 
ctaattttta 
tcctgacctc 
atcaccacac 
gcacacactt 
ggagttttga 
ttattgaatt 
attcccactt 
taattctgct 
tcagactgtt 
tttgtctgtg 
ttttcttttg 
aatttttgtc 
aaatatggtg 
ctgttaaact 
tggccccaaa 
ccttagttta 
agagtttcta 
ttttgagatg 
actgcaacct 
ggattacagg 
ttcgccatgt 
tcccaaagtg 
tctttgcttt 



ccatcctgac 

tggtggcggg 

cccgggaggt 

caacaagagt 

tgaattgcca 

aggcaagttt 

tttatttact 

ctaaaagtga 

ttgcttctat 

cctttctaag 

attcatggaa 

tcattcaaca 

actggctgag 

ttctcttgat 

aggaattgac 

atgtgacagt 

catggaaacc 

tccactaaaa 

acatatgtgt 

atgattggaa 

gtagtagagt 

gtgttatcag 

gacatcaaac 

tttactgcac 

tttattaagt 

tgcaggtagt 

ggctggagtg 

tgcctcagcc 

tttgtatttt 

actttgtgat 

cgcccggact 

cttcctctag 

ttttttcttc 

tagacagcct 

atttcttcag 

actgaatcaa 

aaccaggctc 

tattttagta 

aagtgaccca 

ctgacggcat 

gtactcccag 

gaccaacctg 

ttttatttct 

ttgatcccat 

atctacagtt 

ggtggcttgt 

ggaattctct 

ccatgcacct 

ttgggtgctc 

tttgattatg 

taatgtgttg 

tgtttaagat 

acacaaactc 

acctgtttct 

gagtctcact 

ccacctcccg 

tgcccaccac 

tggccaggct 

ctaggattat 

agtggtaatt 



caacatggtg 
tgcctgtaat 
ggaggttggt 
gaaaagtctc 
ttaaaagaga 
tggacgattg 
ttagttgcta 
ttgactaatc 
cttttaggca 
acccttgaac 
cttcggcgaa 
acttggtcgc 
cctgtctctc 
aagagattgg 
ctgaataata 
gcccattctc 
caagaatctt 
gaattgtcag 
ctcactttct 
tatttttttt 
cagatcagtg 
aattagtaaa 
cattatccac 
tttgcagata 
tcagtgctta 
aggcatgatt 
caatggcgcg 
tcccaaatag 
tagtagagac 
ccgcctgcct 
gattatctta 
gttgtacagt 
tctattagtt 
tcatgttctt 
atcttttagt 
gttttcagca 
ccgaggagct 
gagacggggt 
cctgcctcgg 
gttattttca 
ctactcagga 
ggcaacacag 
atggatcatt 
cttttattta 
ttttgtggac 
gattttagtg 
gtgtactctg 
ggggcctggg 
gtactgatcc 
gggtattgtc 
tccctgtaaa 
aagggcactg 
acctttttaa 
ggaaatggaa 
ctgtcaccca 
ggttcaagcg 
catgcctggc 
ggtcttgaac 
gtttctgtaa 
ttcaataaaa 



aaaccgaatc 

cccagctact 

tgcattgagc 

aaaaaaaaaa 

tatgagaatt 

tactttgttt 

ggaagtaaac 

tgtcaatctg 

gatcttgtct 

aagtcttgca 

tggagcattt 

gaataagagc 

catctaaaaa 

aggattctgg 

gaactaacag 

tccgtcttga 

cctctacact 

gaaaactaat 

ttttttaatt 

ctcctctaac 

tatggaaaag 

tgaagggtca 

atagccttat 

aaatttttaa 

gtgtatattt 

atcttttttt 

gtctcggctc 

ctgggactac 

ggggtttcac 

cggcctccca 

tttacacatg 

aaatgtggac 

tagaaattat 

tttcatctta 

tcactcattc 

tgttatttcc 

gggattacag 

ttcaccatgt 

cctcccaaag 

tcgcaaagtt 

agcttaaggt 

caagacccca 

ttttgtagtt 

tttagtttta 

ctgactcagc 

atttttggcc 

tataaattaa 

tcactaccct 

tgtatgagta 

ccagatggtg 

actccaaaat 

cctgtatttg 

aaaacatttt 

gtccaaagtc 

ggctggagtg 

attctcttgc 

tgatttttgt 

tcctgacttt 

ttgtaataca 

atagaaatag 
103860 
104100 
104220 
104340 
104400 
104580 
104640 
104700 
cagtggagtt 
tatatatagt 
agattcctgt 
gtcaagcccc 
tttgcctttt 
ccgagctgga 
gcagttctcc 
ggctgatttt 
caaactcttg 
cgtgagccac 
tttttcttgt 
gccatgaaga 
gcttatgcct 
gatcgagacc 
aaaaaaatta 
gcaggagaat 
cactccagcc 
ttttacactt 
ttacgtcaga 
ccatttgttg 
gtctaggcct 
taataccaca 
ataaaacgaa 
gtagaattgg 
ggcctagggt 
gtgcagtggt 
cgcctcagcc 
tgtattttta 
ctcaagtgat 
tgcctggcag 
tttaatagat 
tttatggcct 
tatagcattt 
ttgtttttct 
ttttgatttt 
tttctattat 
taaagtagaa 
ctgtaaattt 
ctgagcacaa 
ttagaaatat 
catttatttc 
ttaattttta 
aaagattgta 
taatgatggt 
ttattactca 
agctctatct 
aggatgatat 
aattttcctt 
ttcatttgat 
ttataattct 
tttatttatt 
aagctggagt 
gattctcctg 
tgattttttg 
ctcctgacct 
agccactgca 
gttggtaaat 
tgttgcactg 
actcataata 
ttctccctct 



attaaaagag 

aagtttgacc 

aactgtcacc 

acctctatcc 

ctcttttttt 

gtgcagtgag 

tgccttagcc 

tttgtatttt 

acctcaagtg 

tgtgcccaat 

atggattgtg 

ttttctcata 

gtaatctcag 

atcctggcta 

gccgggcgtg 

agtgtgaacc 

tgggcaacac 

tacgtttaga 

ttttttgttt 

aaaagacaac 

gtttttggac 

tggtcttaat 

ttgggaagtt 

tgtcatttct 

tttgtttttt 

gagatcttgg 

tcccaaatag 

atgcagacag 

tagcccacct 

gggcctaggg 

ataggactat 

ttgagtaatt 

cgggtttgta 

ttgtcagatt 

tctgttgttt 

ttctgcttgc 

acttagattt 

ccttctaacc 

atgaaatgtt 

gttatttagt 

tcatttcatt 

aaaataacat 

cattctgttt 

gttcagtttt 

gaagagtgtt 

ggttttgctt 

cttctgggtg 

gttctaagat 

tagtgcttgc 

atttaaaggg 

tatttattta 

gcagtggtgc 

cctcagcctc 

tatttttagt 

caggtgatcc 

cccggctgag 

ttaattattt 

gggtatttat 

atattaatat 

ttgatttccc 



cattagttac 
tttttaaaat 
actataaggg 
caacacttgg 
ttcttatttt 
gcaatctcgg 
tccctagtag 
tagtagaaat 
atccacctgc 
caggactttt 
ccttcagagt 
tgtttccttt 
cactttgaga 
atgcggtgaa 
gtggcgggca 
cgggaggtgg 
agtgagactc 
tatatatctt 
tttgtttatt 
ctttactcca 
tcctttttct 
tactgtatag 
tttattttta 
tctttagata 
gtgtgtgaga 
cttactgcaa 
ctgggattac 
ggtttcacca 
tggcctccca 
ttttcttttt 
ttaagttatc 
aattgtattg 
gtggtatccc 
gtatagggat 
tgttttcaat 
tttgggttta 
ctggtttgag 
actgctttag 
ctaatttccc 
ttgcaagcaa 
atattatggt 
taaaaaattt 
ttggacagtt 
tctttattct 
gaactttcca 
catgtatttt 
aattgcctgt 
cagaaatatc 
atggcatatc 
ggcttcttgt 
tttatttatt 
aatcctggct 
ccaagtagct 
agaaacggat 
acctgctttg 
tcatgttatt 
taatataaat 
aatgtgtaaa 
ctttggattt 
cttttttgct 



atttttccct ttttcattat cttcaaatat 



gtatacttgt 
taaagaacag 
caaccgctga 
tttttttgag 
ctcactgcaa 

ctgggattat 
ggggtttcac 
ctcggcctcc 
tttttttaaa 
cacacctaag 
taaaagtatt 
agctgaggtg 
accccatctc 
cctgtagtcc 
agcttgcagt 
catctcaaaa 
ttttgagtta 
tttacatatg 
ttgaattgcc 
gtttcatgat 
taagtcttaa 
ctcttatttc 
tttggttgaa 
cagagtctca 
cctctgcctc 
aagcgtgtgc 
tgttagccaa 
aagtgttagg 
cagagtattt 
tgtttcttct 
aattgtcaaa 
tcttttattc 
ttattagtct 
tttattgatt 
ttttactctt 
acctttcttt 
ttacaccccc 
ttgaatctta 
ttggagactt 
cagagaatat 
tttaaaatgt 
ttctataaat 
tgctgatact 
actacaattt 
gaggctctgt 
tttatcatta 
tgttgtccaa 
tttttccatt 
aggcagcata 
tattgagaca 
taccacaacc 
gggattacag 
tttcaccatg 
gcctcccaaa 
tttaatcttt 
tttagtataa 
tataattatt 
agattaccag 
tttttttttt 



atcagtttta 
ttagttcctt 
tctttctccg 
acagcgtctt 
cctccgcctc 
aggcacgcac 
catgttggcc 
caaagtgctg 
tttacattca 
agccctttgc 
gtggttggcc 
ggcagattac 
tactaaaaat 
cagctacttg 
gagccgagat 
aaaaaaaaaa 
atgtcgtata 
gatgtctagt 
tttgtacttt 
gtgtgtgtct 
aattgggtaa 
cattttctag 
ttgggaagtg 
cttctgtcac 
ccaggttcaa 
caccatgccc 
gctggtctcg 
attatagatg 
taaactatga 
tgagtgaatt 
tttatgagcg 
ctggtgttgg 
tttcaaagaa 
ttctgctctt 
ttttttttct 
tctaagataa 
acaaattctg 
ttcttttacc 
ttttcctgtt 
attttgaatg 
gaatatacca 
gtcaagttga 
ttgtatgcag 
ttttttccaa 
tgttaggtgt 
tgtaattccc 
tttatataga 
tttttacttt 
tagttgggta 
gagttttgct 
tccacctcct 
gcacgcgcac 
ttagccaggc 
gtgctgggat 
tctcacaata 
ttatttacat 
ggtattaata 
tttagtatat 
ttttaattct 



acacatacat 
cacctttgaa 
tctcaatagc 
gctctgtcgc 
ctgggttcaa 
caccacaccc 
aggctggtct 
ggattacagg 
acttgtcatt 
ctaagcaaag 
aggtgccatg 
gaggtcagga 
acaaaaaaaa 
agaggttgag 
cgcgccactg 
agtattatgg 
agtatgaggg 
tgttctaata 
tgccatattt 
attcctttgt 
tgctggcctt 
aagagattgt 
atgccatctg 
ccaggttgga 
gttatcctcc 
gactaatttt 
aacttgtgac 
tgagccaccg 
attcagatta 
tttactgtag 
tgtaattatt 
caattgtgtc 
ctagcttttg 
tattatttct 
ccaagttgct 
gcatttaata 
gtattttgaa 
aatgaattat 
atttttctac 
atttcattta 
catacagtat 
tttagttggt 
ttatatcact 
ttttactttc 
gtacacattc 
tctttatggt 
cactgcagct 
tgatctacct 
gtgttattta 
cttgttgccc 
gggttgcagt 
catgcctggc 
tcgtcttgaa 
tacaggcgtg 
cagggttttt 
taaatgtaac 
taattatatt 
gtttttctgt 
tatttttttt 



105960 
106080 
106140 
106800 
107220 
107280 
107340 
107400 
107460 
107520 
107580 
107640 
107700 
107760 
107820 
107880 
107940 
108000 
108060 
108120 
108180 
108300 
108360 
108480 
108540 
108600 
108840 
108900 
108960 
109020 
109140 
109260 
109320 
109380 
109440 
tagtatttgt 
acaatagttg 
cagaggaccc 
gtgatgactc 
ccacccttaa 
gggggtaagg 
aaaatggagt 
tttcttttcc 
cgttctcaat 
cctcacttcc 
ggccgggcgg 
gaccccccac 
cctcccagat 
cggctgccgg 
tcctcacttc 
cggtcgggca 
tcccatccca 
cgggcagaga 
ttcccagacg 
cagagacgct 
ctttgggagg 
actgcactcc 
ggcacctcgg 
gccaacacag 
gtgcgcctgc 
cagtgagccg 
agagggagac 
ggattatttg 
gattctcctg 
taattttttt 
actcttgacc 
gagccaccat 
tatgtaggct 
tcacatttaa 
gaactaggga 
cgtttttttt 
tggagtgcag 
ctcctgcctc 
tttttgtatt 
accttgtgat 
cgcccggcta 
attttagtaa 
cttctttaaa 
aattcactgg 
aattttgttt 
ttgctcctga 
tagctactta 
ttgtgagata 
gacagtttga 
agtttttgct 
ttctgtcggc 
tctgtgtcca 
aagcagggcc 
ttttcatggg 
cctttatccc 
gcacacttcc 
gtttctcttg 
gatgctgcct 
tccactctct 
ctgtcttgga 



tgatcattct 
agggaaggtc 
tgcggccttc 
ttaacgagca 
tccatttaac 
ttatagatta 
ctcccatgtc 
ccacatttcc 
gagctgttgg 
cagatggggc 
aggcgccccc 
ctccctcccg 

ggggcggctg 

gctgaggggc 
tcagacgggg 
gagacactcc 
gacggggcgg 
cactcctcac 
gggcggccgg 
cctcacttcc 
ccaaggcagg 
agcctgggca 
gaggccgagg 
cgaaaccccg 
aatcccaggc 
agatggcggc 
cggggagagg 
aatttttcct 
ccacagctcc 
gtatttttag 
tcaagtgatc 
gccctgcctt 
ttttagtggc 
tattttgtaa 
tgaggttaaa 
tttttttctc 
tggcgtgatc 
agcctactga 
tttagtagag 
tcgctctcct 
agtctttaaa 
cccaaatgtt 
tttttttttc 
ttctttcccg 
attatgtttt 
aactcttatt 
aaatttgttt 
atgggatttt 
cttacgttac 
ttaacaagca 
tctggttcca 
gtctgggacc 
atgcacaccc 
ctccttcttt 
tttcctgttg 
tgtagctatg 
tcagaaagta 
gggagtcgaa 
ttgatccgtg 
acttcttctt 



tgggtgtttc 
agcagataaa 
tgcagtgttt 
tgctgccttc 
cctgagtggt 
acagcatccc 
tacttctttc 
cccttttcta 
gtacacctcc 
agccgggcag 
cacctccctc 
gacggggcgg 
gccgggcggg 
tcctcacttc 
cggccgggca 
tcagttccca 
cggggcagag 
ttcctagacg 
tcagaggggc 
cggacggggt 
cggctgggaa 
acattgagca 
caggcagatc 
tctccaccaa 
actctgcagg 
agtacagtcc 
gagagggaga 
taaatttatt 
caagtagctg 
tagagacagg 
cacctgcctc 
tttctagaat 
ttctctagga 
cttcaagtgg 
aaagagagag 
tttttttttt 
ttggctcact 
gtagctggga 
acggtttcac 
cagcctccca 
tatttttttg 
agttttgtta 
ctgttgttca 
ttatttccat 
ttagttctaa 
tgtttcagga 
tatcatccca 
ctggttcttt 
atgattctga 
gttgacctag 
tcatcagttc 
tggccaatgg 
agctcacgag 
tctgtgatgt 
tctggctaga 
tcaacctctg 
ggattcttgg 
ggagagaaag 
agagccccct 
tgtgcatctg 



ttggagaggg 
catgtgaaca 
gtgtccctgg 
aagcatctgt 
aatagcacat 
aaggcagaag 
tacacagaca 
ttcgacaaaa 
cagacggggt 
aggcgccccc 
ccggatgggg 
ctggccgggc 
ggctgccccc 
gcagaccggg 
gagacgctcc 
gacggggtcg 
gtggtcccca 
ggatggcagc 
tcctcacatc 
ggcggccggg 
gtggaggttg 
ttgagtgagc 
actcgcggtc 
aaaatgcaaa 
ctgaggcagg 
agcctcggct 
cgagggagag 
tatcttactt 
ggactgcagg 
gtttcaccat 
ggcctcccaa 
ttatatattg 
attacaatat 
aatgtagaaa 
aaaagaaatg 
gagacagagt 
gcaacctccg 
ttacaggtgc 
tgtgttggcc 
aagtgctggg 
acattgcact 
ttgtttggca 
gcttcgaaaa 
tctgttattg 
aattttcttt 
gtgatcttat 
gcatatgtgt 
atatgacaat 
atcttgttta 
ttaggttcag 
agttttgtat 
tcaggtccca 
tggccccggg 
ccctgacacg 
aagtcagggc 
tggccacgac 
agctgctgtc 
gaacaaaaca 
ttcctgttcc 
gtgtgcagtt 



ggatttggca 
aggtctctgg 
gtacttgaga 
ttaacaaagc 
gtttcagaga 
aatttttctt 
cagtaacaat 
ctgccatcgt 
ggcagctggg 
cacctcccag 
cggctggccg 

gggggctgac 

cacctccctc 
cggctgccgg 
tcacctccca 
cggccgggca 
catctcagac 
cgggaagagg 
ccagacgatg 
cagaggctgc 
tagggagctg 
gagactccgt 
aggagctgga 
aaccagtcag 
agaatcaggc 
ttcacaactt 
cccctttttt 
atttatttat 
catgtgccac 
attggccagg 
agtgctggga 
agttcttgat 
acatactttt 
acttaaccac 
taataaagat 
ctctctttct 
cctcctgggt 
gcgccaccat 
aggatggtct 
attacaggcg 
ttttctcttt 
ggttcctgag 
tttctattca 
agtctttgta 
ttttgtgtat 
ttcttagagc 
cctcttgatt 
taattttgga 
aatcctgtgg 
tccacaaatt 
cttatctgct 
aagcctttgt 
agtgcacata 
ttctgccttc 
tttagattcc 
ttcttcttct 
attgctgctg 
aaacaaccca 
tcagaccaga 
tcagcttttg 



gggtcatagg 
ttttcctaga 
ttagggagtg 
acatcttgca 
gcagggggtt 
agtacagaac 
ctgatctctc 
catcatggcc 
cagaggggct 
acggggcagt 
ggcgggggct 
cccccacctc 
ccggacgggg 
gcggaggggc 
gatggggtgg 
gaggcgctcc 
gatgggctgc 
tgctcctcac 
ggcggctagg 
aatctcggca 
agatcacgcc 
ctgcaatcct 
gaccagcccg 
gtgtggcggc 
agggaggttg 

tggtggcatc 

gctttctttt 
ttttttgagt 
tacacccagc 
ctggtcttga 
ttacaggcgt 
tgtatctttt 
cacagtgtac 
cataaaaata 
ttaataacac 
gttaccaggc 
tcaagtgttt 
gcccagctaa 
cgatttcttg 
tgagccaccg 
tccttctagg 
gctttcctta 
tctgtcttca 
gtgaatttta 
gtcttatact 
atggttttag 
gtcttttctc 
ttgtatcttg 
aaaatattga 
ctaagcagca 
tatgtgcctt 
acacttttag 
caactcgacg 
taagaacctc 
ctatacttca 
tgggactgca 
tggctgctct 
ggggatttcc 
aatagagggc 
agtccaggcc 



109560 
109620 
109680 
109740 
109800 
109860 
109920 
109980 
110040 
110100 
110160 
110220 
110280 
110340 
110400 
110460 
110520 
110580 
110640 
110700 
110760 
110820 
110880 
110940 
111000 
111060 
111120 
111180 
111240 
111300 
111360 
111420 
111480 
111540 
111600 
111660 
111720 
111780 
111840 
111900 
111960 
112020 
112080 
112140 
112200 
112260 
112320 
112380 
112440 
112500 
112560 
112620 
112680 
112740 
112800 
112860 
112920 
112980 
113040 
113100 
aggaggtgct 

tcagtccacc 

tgggagagac 

aatcaacgga 

aaaacaacac 

tttttttttt 

ttggtaaagc 

ttattcttcc 

ttttctcctt 

tgagtttatt 

ttttccatag 

aagcaccttt 

tattcatgaa 

tcttcccttg 

gatgctgcta 

tgcttgaaac 

attgactaag 

ctgagtattt 

cgagggtctt 

gctgagcagc 

ccggcagccc 

cagtgaggct 

cacgaccagt 

gggcttcatc 

attagccaga 

acacagtgtc 

atgtccagtc 

aaaaaaaaaa 

cctgaataac 

tcccagacaa 

tgtgggataa 

tttgtccagg 

accttaactg 

tatgccctca 

atatatatat 

tgtattagta 

atatacatat 

gagtagtaag 

tatgctgatt 

ctgtgtttta 

ttcattcaat 

aaagttagta 

ccgatgttgg 

cagcaagacc 

tagctattct 

tacctataac 

tatttaaggg 

tttctgtgtt 

gcttttccta 

ttggttggct 

tgtggataaa 

aactctcttt 

aagtcttctg 

atcccagtgt 

tcatttttcc 

gttggactat 

tagctcttac 

ggtgttgcta 

tgcactacct 

aggtacagct 



ggacaaactt 

tgcttccaag 

agaagagtgt 

tgatattctc 

tttaacaccc 

tttgagacgg 

atcaataatt 

attttagcag 

cttagtgcct 

ttggtgtaaa 

ctttgaacac 

gtgatggaag 

tataattcat 

caggtataga 

aaccaatacc 

tatggcaaaa 

ataatttttt 

tttatatcgg 

tgcctgggcc 

cgggccggcg 

cgcaccagtg 

aaaagccggt 

gggcaactgg 

ccacttctca 

gccatcacat 

atttctgtgt 

ccagtttcac 

aaaaaaaaaa 

aatgacagca 

tattccaagc 

atacagttat 

gtaacagctc 

ctgtgctgtg 

ccccctgcca 

atatatatat 

tatatgcata 

tagtgtgtgt 

gacaaacatt 

caacaaatat 

tcaggaagac 

aaatattaca 

tttgtgattt 

gcttctttaa 

cctgttagtc 

ggggtccatg 

ggctataaca 

gtactgtttc 

gttgctctat 

gtaatacaac 

agtgattttt 

aaggaagcag 

cttttactta 

aaggatcaag 

ctatcattat 

tccttgagcc 

tttaatatag 

tgtgtaccca 

actaattact 

ccttcatctt 

gacagaattt 



gtcaggagta 

tccttggatg 

gcttatttca 

tatattaatt 

gcctttggtg 

agtctcactc 

ttatctttca 

aattcatgtt 

cagagtagat 

agtactttga 

ccccatgtaa 

tttattttgc 

tactggagtc 

acaagatgca 

aattacagaa 

aaaaaatgac 

cttaacatgg 

attatagctc 

agatgggctg 

ggcggctacg 

cacgaagtgg 

accaaagtct 

atggccagac 

gtgggcctga 

ggcctgtgac 

catttggcac 

gtaactttat 

agtttttctt 

agatcaataa 

actttttatg 

tttatagatg 

agatatggca 

gcagtgtttt 

aaaaaaaaaa 

aatatatata 

tatagtatat 

atatatatat 

tcagaaaaat 

atttcttata 

cttaggtgaa 

ttctcataag 

atgaaataag 

tccttagtgt 

tcagctgtgt 

tcatgttggc 

taggcctggt 

actgagtttt 

ttttatgtgg 

agggatgttc 

ttttgagggg 

tttcaagtca 

agcttaatca 

ttgataacat 

atattttagg 

ccattcttaa 

ctgtccttca 

ctttgcatag 

gtttttatgt 

ccacaaatgt 

gctgatggtt 



cggaggtact 
catttgtcca 
tcttgacata 
tgctgttttc 
gtttctgtca 
tgtcctttga 
tccacacaag 
gctccaatag 
cctgttcaga 
aattcatgca 
ctctcctctt 
aataggaact 
caagttgctt 
gtgaatactt 
gcaatgagaa 
aaaaaatgca 
aatttagcag 
actttaaaag 
cagtgtagcg 
ctaaccggca 
gcgggacaga 
ctaggcatca 
aggtgtctca 
cgtccctggg 
ttgccttttt 
agctggaggt 
tcttctgaat 
atatgttgga 
atagtacaca 
gatagactca 
aagaaactga 
gagtcaggat 
tcatactgta 
aaaaaaaaaa 
tatataaaat 
attatatatt 
atactagaat 
gttttcatta 
ggttatagca 
cgtatattca 
tcctaatatt 
acatgttctt 
gggtgctttg 
ttcttaaatt 
tccattttcc 
ggctgttggt 
gctgacagat 
gaatttgcta 
tgactgatta 
agtctgtacc 
aataaaacac 
aattaatgat 
tttgtgatca 
atgttaatta 
tcctgtccaa 
agtgagtttt 
tcttgtttta 
gaggatttag 
ttgaagtggt 
tggaagtgag 



gcaagttctg 

ttgttttgag 

cttattagga 

cctttagcaa 

taattattaa 

ggcattgtcc 

cttcaccata 

gggctgtctt 

tacgttataa 

tagttttttc 

ccacaaacca 

cacagtgatc 

tttggttttt 

ttaccaaata 

atgacatcat 

cagaactgac 

ttcccttcct 

tttctcggct 

ggtgctcagg 

cagaccaccg 

aacttctggg 

gggctgcagc 

gtggtggcct 

caccctggat 

ttgccagttg 

gcaaggagga 

aaagacaatt 

cccaaattct 

tttattaaac 

ttttaacttc 

agcacagaga 

ttgaaactag 

ggttgggacc 

aaatatatat 

atatatatat 

agtatatata 

aaaaaaatca 

tatatacatg 

aaatagtttg 

cagataaaag 

atgtattttt 

gcacttttag 

cactcactca 

ggcccactgt 

ttttctttct 

ggcttatccc 

gttgtcatga 

ctatcatcat 

gagtttgcct 

agttaatagc 

ttaaaatgaa 

gatgtaatcc 

aagaatttga 

cctgtgtggc 

attatttgtc 

gttcaaagga 

aatgtaatcc 

agtgatccag 

agaattttta 

tggtatgaga 



attacttttc 
ttgcattcca 
tttcatatca 
gcacattagg 
tacttgactt 
ccataaactt 
aatttgatgt 
caaactgatg 
caggttaata 
atcatatgca 
aacaatgaaa 
taagccctgc 
gaagttctct 
tatatctcca 
aggtaagcag 
aattttcgtt 
aatttgtttt 
gcattcggtg 
cctgcccgct 
gatggactgg 
gttggaagtc 
ccaagagtct 
ctccgtctca 
gtctacctgc 
attgtgccac 
gggcagcctc 
tgctaacctt 
taggctttaa 
actcactgtg 
taaagaactt 
agttaagtgc 
accctcacat 
agccttctct 
atatatatat 
ataaaatata 
ctaatatata 
aagtatctca 
tatgtatgtg 
aaagctttta 
aggttattta 
attcttcaaa 
cagatctgtc 
ctgctgggga 
accttccagt 
cccacacaga 
tatctgcttg 
gatttgaggt 
ccctagacca 
gtttgaagaa 
ctgactggcg 
accacactgc 
catgaaggaa 
gaaaacctct 
tttaggcaag 
tcctcttgca 
gccttcactt 
ttggattttt 
aatctatact 
aaaactttga 
gggaaaaaaa 



113160 

113220 

113280 

113340 

113400 

113460 

113520 

113580 

113640 

113700 

113760 

113820 

113880 

113940 

114000 

114060 

114120 

114180 

114240 

114300 

114360 

114420 

114480 

114540 

114600 

114660 

114720 

114780 

114840 

114900 

114960 

115020 

115080 

115140 

115200 

115260 

115320 

115380 

115440 

115500 

115560 

115620 

115680 

115740 

115800 

115860 

115920 

115980 

116040 

116100 

116160 

116220 

116280 

116340 

116400 

116460 

116520 

116580 

116640 

116700 
ggaataaagc atgactgcat tttttgtttg tttgtttgtt tgtttttgag acggagtctc 116760 

actctcgcca ggctggagtg cagtggcgtg atcttggctc acggcaacct ccgcctcctg 116820 

ggttcaagcg attcccctgc ctcagcctcc caagtagctg ggactacagg cgctcgccac 116880 

cacgcctggc taattttttt ttttgtattt tagtagaaac ggggtttcac cgtgttggcc 116940 

aggatggtct ccatctcctg acctcatgat ctactcacct tggcctccca aagtgctgag 117000 

gttacaggca tatatataag catataaagt gtgttatagc atacaaacag gtatatatat 117060 

aaacatgcag tccacacagc tgataggaat gaggcagtag tgaaggagaa gttgatgtag 117120 

gagaggggac agttgttaca ggaaagaagt ctggaggcag aagggatgaa ttccagtgct 117180 

cacatagaag attgcttaga tgggagcaag gacaatttat ctagagtcac aggaaagaat 117240 

gcagtacacg ggtagagatg caggtgagtt gaaagatgtg agagatgatg gaaataattt 117300 

tctgattgct tctatattct caaggaagca ggaagcaaag tcctcagcaa agagaataga 117360 

agaggtgtta aatatttgag aaaggagatg tactgtagaa aaaaaaaaaa ctcagtttct 117420 

ccttctgaac tctcacaaaa cagaaccctt ccatgactct agttgtgtgg ggttttttcc 117480 

ctgtcagcta ccaattctgc agatgattgt tcagtgaaca ccaactgggt gtcctctaag 117540 

tcagttcagt tctcacactg tttacctgga gatagcatca gatcccacag attgaggact 117600 

ctgtcccaca agactgcctc cacttcagat gccagtctca agtacaagtt gtggcctgtg 117660 

cttctgactg accttctata aattggagtt cccacagtcc cctccttggg ttcaataaat 117720 

ttgctagagc agctctcaga actcagggaa atgctttaca tatatttacc catttattat 117780 

aaaggatatt acaaaggata cagattgaac aggcagatgg aagagatgca tgggcaaggt 117840 

atgggagagg ggcacagagc ttccatgcac tctccaggtc atgccaccct ccaagaacct 117900 

ctacagattt agctattcag aagcccccct ccccattctg tccttttggg ttttttgtgg 117960 

agacttcatt atataggcat gattgatcat tggctattgg tgatcagctc aaccttcagc 118020 

cccctcatcc cgggaggttg gtgggtaggg ctgaaagtcc caaacgtgta attctgcctt 118080 

ggtctctctg gtgattagcc ctcatcctaa agctctttag aggccacagc cacaagtcat 118140 

ctcattagcc ttcaaaagaa tccagagatt ccatgaattt taggcgctgt atgctaagaa 118200 

actggctaaa ggccagttgc aatgtctcag gcctgtaatc ccagcacttt gggaggctga 118260 

ggcaggagga tcgtttcagg ccatgagatc aaaaccagcc tggtcaacat agtgagaccc 118320 

ccttacaaaa aatttaaaaa ttggccaggc gtaatagctc ttgtctgtag tctcagctac 118380 

tcagaaggct gaggatcact gagccctgga gttgaaggca gcagtgagcc atgatcgtgc 118440 

cactgactcc ggcttgggtg acaaagtgag accttgtctc agaagaaaaa ggaaaaaaaa 118500 

aaaactgggc aaagactaaa taacatattt cacagtatca cagatttgta ttgtctagga 118560 

aagtgaatgt aaacagacca ggacactagt atgatccctt ggtttcatga aggtcccact 118620 

aaagtcatga acacaaagtg agactaggca tcatgttata tggtttttcc agccatgttt 118680 

aacagctagc taaatagcta attgtttcgc tgcagtttat tttagcagtt ccttatttta 118740 

gcacatttca tgttttaaaa tttctaccaa taacatttta ataaactttt ttacagataa 118800 

cttcacaaat ccataatttt ttaagttaca atcccagaaa tagaattgct cattgaaagg 118860 

gtatgttcat ttttaaagtt atgctagaaa ctgccaaatt gccttcagaa aaaggtgttt 118920 

gtatccccac taacactagt gttagttttc ttgtgccctt gctcaagtat acatattatt 118980 

aaaaacaatg ttgggccagt ttactagata aaaggtgtag tgcctcctta ttctaatcta 119040 

tttgattact agtgagtatg tatgtctttt cacgttggtc attttatgtt tgttcctttg 119100 

tggattgtca tgtcctttgc tcatttttct tttggaacat ttcttagtag tttataagag 119160 

ctcttggtat tttaatgata gtaacctttt aactgtcatg catgctgcaa atcttttttc 119220 

tgtttgtttg cctttgtatt ttgtttttgg agggtttcta tgtataggaa ttaaatttta 119280 

tgttgttaaa tcttttgatt tctgcttttg catatgtact tcaaaagact ttctatttta 119340 

agatcaagtg ttacctgtat tttcttttag ttctatttaa aacctcttaa tttatatgcc 119400 

tgtgctgtta actcccaagt tgattcacaa gtgtgtatac atagtttgaa tttagtggca 119460 

atttaattat ttacaacttc ttttgcagca aggatttgtg gagaagatgg acaggtggat 119520 

cccaactgtt tcgttttggc acagtccata gtctttagtg caatggagca agagtaagtt 119580 

agttcatatt ttcacattgt gcatcctagg gaatttgggt tcattgttag gaatgggctt 119640 

cactcagcta aaaacaaagt atttttgaga atttaaatat tttggatatt tacaagatca 119700 

tataaagcat actctatctt ggttaacagt ttcttttaaa tataaattat gtgaactctt 119760 

aaaattttca ttttcatttt caatgttaat atttcctaag ttaaaataat ttgtttttag 119820 

ttctgaaata atttggggag tgattgagtc tgtagtgatt atgactatta gaattggttt 119880 

atttatttaa ataatgcatg tcttcagatg gctctcctaa tttgttagtt aggctttaag 119940 

ctaaatggat gctatataac taaatccaca tagatttgtt gaaatggctc cagaggtttt 120000 

ttagatttat tactgctatg tgcccttaaa aaaaatctat tcattctttc acttaacatt 120060 

tatcagaaga gtgctctgtg taagacgtgg ttaggcatag tgccagtctt gaaggaagtt 120120 

acagcctaat aaaagacata gggcatgttg tttggttact gtaatatgaa gtggcatgtg 120180 

ttaaatgtca ggggagaact acaaagtcat aaaaaggtgg gagagattac atacaggtaa 120240 

aggaatcagg aatgacacca tggggagtaa ggtagtgttg acctaggcct ttaagataca 120300 
atagggacag 

tcaaattttc 

tctgtaaata 

tctgtgagtc 

attcgagcat 

gaagagccaa 

agtaatttaa 

ttgagaaacc 

tattttaaaa 

agaagagaaa 

ataggattgc 

gcatacagtt 

aagcaaatga 

ctcgattctg 

tctttgctat 

gaaatgatca 

aggaaccaca 

gctgtgtagg 

acttttcaat 

tttgtttttt 

agaaatctgt 

agaggatgca 

tgctgccaaa 

tgacaagtga 

tttgtggaaa 

cataaaagct 

ctgctgggaa 

gtgtttccca 

agagaaggca 

ggggaaagtg 

atgaggatgc 

tatattatac 

ttcactgaaa 

aggcagatga 

tcttactctg 

acactttgag 

gcaacatggc 

cgcctgttgt 

tcgaggctgc 

accctgtctc 

aatgctaaag 

ctaatagtat 

tgcataggaa 

acgcctataa 

tcaagaccag 

caggcatgat 

ctcgagcccg 

gctgacagag 

ccggtggctc 

ggtcaggagt 

caaaaaatta 

gcaggagaat 

aaaggtttcc 

gaatctttat 

agtctaattt 

caaagtcaca 

tccagtgttt 

atagatgttg 

caagtcgtgg 

tacaaaaatg 



tatggaaaga 

ccttttgtcc 

ccagattgaa 

agccctcttt 

tccagcctct 

gtcatttcca 

taatatatta 

taattaaact 

ccacagaatt 

aaaagtagta 

ccagtgaaga 

tagtataatg 

ctattaagta 

cctctttaca 

aagtttaaat 

gatataaaat 

aatccagttt 

tagtgtttgt 

ctcaaatgac 

gcttatgacc 

ttatagtctt 

gtgaatatct 

aagggccaat 

gttatattga 

taaagatgaa 

gcttttatat 

aggggaggtg 

gaagagagat 

gaaaacaaca 

gcactttcaa 

tggtactaat 

tttctgtata 

atgagaggat 

tgagcttgtg 

taaaaaaagt 

aggcagaggc 

aaaacccgcc 

cctagctact 

agtgagctgt 

caaaaaaaaa 

gggtaacttg 

ctaggccagt 

tcacctgaga 

tcccagcact 

cctggccaac 

ggcacacacc 

ggaggtggag 

ggagactctg 

acacctgtaa 

ttgagaactg 

actgggcatt 

tgcttgaacc 

agtccccctg 

ttttagaaga 

tcctattttg 

gaacaagtta 

attctggtac 

agataggaag 

aggagtattg 

ctgatttaaa 



gtatattttt 
atgtgcaggc 
gtgctgacca 
tatttctctg 
aactatcaat 
aaaagatgta 
gagagaacat 
actgcatgta 
tgaaacttgc 
aattttttct 
agcatttgca 
ctctttgtta 
gaaagaggat 
agaatacagg 
caacaatttg 
atttggtttg 
agtataattt 
ttcttgttaa 
tgtaacttgc 
tgtatttcaa 
attttccctt 
tacaattctg 
atgatggaca 
tagatggatt 
taaactcagt 
ggtgtttgta 
tatgtggggt 
tttgtttgga 
ttctaggcaa 
gaaacttgag 
tggaatagat 
aatctgctca 
ggaaacatca 
gccagctctg 
aattcgtggt 
aggtgaatcg 
tttactaaaa 
taggggcctg 
gatccactgt 
aaaaacaaca 
gggatagaga 
ggttcctgaa 
gcttattaaa 

ttgggaggct 

atggtaaaac 
tgtaatccca 
gttgcagtga 
tctcagaaaa 
tcccagcact 
cctggccaac 
ttgacgggtg 
cgggaggcag 
tctcagaaat 
cataccagat 
caagtaagga 
gtggcagaat 
agtatgtttg 
agtttacatt 
accaacttac 
aggagagttt 



cccacttaaa ctctttcctt 



actttagtga 

gtggaactgt 

aggtaaagtc 

gctggggccc 

tcattgtttc 

gaaaattcaa 

agagagtgca 

ttccagtgca 

tatgctcatc 

acagacaatg 

ggcttcaaca 

tcccagtctc 

tactcagttg 

tttaggttaa 

gttagtttac 

ttactctagt 

cttttttttc 

tgacaggtgt 

tatttgagct 

gtgtattttt 

gttggcagca 

ggaggcacag 

cagcagatac 

ctctgttgtc 

aagctttggg 

aaacaggatg 

tcccaaagaa 

aggcattggc 

tttagataat 

tgtaagggac 

ggcacgttgt 

tacagtaaac 

taacgtatgg 

cgggcacggt 

cttgagccca 

atacaaaaat 

aggcagaagg 

actccaccct 

aaggtaattt 

aaagtccaca 

cattagtctg 

aataggtttt 

gaggcaggcg 

cccgtctcta 

gctactcagg 

gcggagatca 

aaaaaaaaaa 

ttgggaggcc 

atagtgaaac 

cctataatcc 

aggactgcat 

tctgattctg 

aattctgata 

aaataaggcc 

ttggactgga 

tagaaggtat 

tagaaatttg 

tcaatacaac 

tctttttttt 



gtttctgcga 

ttacctggct 

tgcatttctt 

tgtctatagg 

aagttgtttc 

tgtattaaat 

tgtttttaat 

taaattgcag 

atttttactt 

agtatattaa 

agtgaaatta 

acaaagcagt 

atttgttttc 

tatgtcctca 

tctttatatg 

tcactaaaag 

gtctaaaaga 

taacagaaga 

tatagattag 

tcttcctagt 

gataacttcc 

aatgatgcca 

ttattgaaca 

aaggagctca 

ggttcttaga 

gcaatggtgg 

agaagggaat 

ccagaagcca 

caaaggagtg 

cttgaatgcc 

taattagttt 

aaaattgaaa 

tattcttttc 

ggctcactcc 

ggaatttgag 

tagctgagcg 

atcacctgag 

gggcagggca 

gttatttgta 

gatgttaggg 

tgggctcttg 

caggctggtt 

gattacttga 

ctaaaaatac 

aggctgagga 

tgccactgca 

ataggttttc 

aaggcaggca 

cttgtctcta 

cagctactag 

ctcaaaaaaa 

caggtttgag 

aatagccagt 

cagagaggta 

atgcagttct 

tacgtaagaa 

gtctaaaatg 

ataggagatt 

cttctttttt 
tttgctggca 
agtagatctt 

agattgtgag 

acatggaaaa 

agtctcagct 

tgattttata 

tttgatatgt 

caggaggcag 

acaaaagttt 

tgttcaagga 

tttgctaccc 

tggaaacgta 

gggaataaat 

tatttatggg 

tttattagtt 

atatctggtc 

atttaacttt 

tgtaatcaca 

accagcctgg 

tgatggcgtg 

ccttgggagg 

gtagagtgag 

tccttaagca 

tttgaagaca 

ctgggctgtc 

gcggtggctc 

ggtcaggcgt 

aagaattagc 

aggagaattg 

ctccaggctg 

agtctgggta 

gatcacttga 

ctagaaacta 

ggaggctgag 
gtgtgaccag 

ttagggatgt 
atggagtctt 
cctccacctc 
aggtgggggc 
tgttggccag 
aagtgctggg 
tatgcatcat 
ttagagtgct 
catttcataa 
ttgtgttatc 
tccaaatagg 
atgtttactc 
aagttctcag 
ctatctatca 
tgcctcagtt 
actaaggatg 
agtgggaaca 
tgcaggctca 
atacttgcta 
ccaagccaca 
ctgcagggaa 
aaccatggag 
agaactcata 
gtccagcaaa 
ccacataaca 
gatcaaagct 
aaagaaaaaa 
taatggagct 
tttagagtgc 
tccttcagga 
gttactgccc 
gatgatcctg 
acaaatttaa 
atgaaaatat 
caaaaagcta 
aaaaaaattt 
acaataatgt 
acttccagtc 
tcttttatac 
taccattgca 
ccaaaagcaa 
tgtataagta 
attgattgat 
gtgcagttgc 
tgcctcatcc 
tgtattttag 
ttaagtgatc 
gcctgggcag 
atgcacgttt 
tattaacata 
ttattattac 
gcatgcacct 
gaattaaagg 
aagaccctgt 
ccacatatat 
acaaggtatt 
ctgtaatccc 
gaccagcctg 
gtggtggtgc 
acccgggagg 
caagagcaaa 



gctctgtcac 
ctgggttcaa 
caccacgccc 
gccggtcttg 
attataggcg 
cataatcatg 
tggaagagag 
tcagcttacc 
aatattggtt 
tcatatgcat 
ctgatgcctt 
ctcgcttttt 
agtcctcaac 
gtagtctgat 
gttggcaggc 
aagaaaggtt 
atcaagtagc 
tggaatttga 
catcctcttg 
ggtgggccac 
aaggtaaccc 
aaatataagg 
gtgattaaga 
ttttagtcaa 
ggtagtgcaa 
gaaaaactaa 
gaaaaatctc 
tccttctact 
ggtttccaga 
ctgaagacct 
accctgtgta 
aaagaaaaaa 
ttttgtacag 
aaaaaagtaa 
taaataaatt 
gctaggcctt 
ttgcaagctc 
tgtattttta 
atagtggcct 
taggttgtac 
cactctgtga 
tgattgattg 
acagtcttgg 
tcccaagtag 
tagagacagg 
tgcctgcttt 
taactgaaat 
acctcaaagt 
gtacctgaca 
gtatttttaa 
atagttccag 
ctgcagtgag 
cttgaacaat 
caccagtaac 
attaatagct 
agcactttgg 
accaacatgg 
atgcctgtaa 
cagaggttgc 
actccgtctc 



ccaggctaga 
gcggttctcc 
agctaatttt 
aactcctgac 
tgagccactg 
cattatcaac 
cctttttttt 
aaaacattac 
actccctttc 
ctagctcacc 
gtagttatga 
agggaaaatg 
agagaatagg 
ccttacagct 
agatagaaag 
acatcagcac 
cttgtataag 
ttttacttcg 
gatttgatga 
tccccaactg 
agaacttcaa 
tgggaaaacc 
gcagaggcct 
cagtggactg 
taataacaaa 
aaagataaaa 
tgttgcctca 
tactaagaaa 
aggaggcatt 
tccagtggga 
ggcttaggct 
aaaattaaaa 
ctgtatatgt 
aacagttaaa 
tagtgtagcc 
cacattcact 
cattcatggt 
ctgtgccttt 
acgatattca 
catatagcca 
tgttagcaca 
attgattgag 
cacactgcaa 
ctgggattac 
gtttcaccat 
ggcctccgaa 
tctctaatgc 
tactttgatg 
catggtaagc 
ataattagag 
ctactcagga 
ccgtgttcat 
taaagaaggc 
tgtcaacagg 
tattaataat 
gaggccgagg 
agaaacccca 
tcccagctac 
agtgagctga 
aaaaatataa 



gtgcagtgac 
tgcctcagcc 
tgtattttta 
ctcaagtgat 
tgcccagcct 
ctttgtattt 
tttctcgcat 
ctgcattata 
cacaccgagt 
cctcagtgct 
tgatgtgttc 
accatgtctt 
tacccataaa 
tttaaacaac 
gtagcaagtt 
tgtcatcaca 
attctctgga 
gatatctttt 
tgttgtacga 
tttcacaact 
acgtatcaaa 
aagcagaata 
tgagtctggc 
cgtgtacgat 
agttagaaaa 
gaataaccaa 
tatttactgt 
acagttaact 
gttatcaaag 
caagatgtgg 
aatgtgggtg 
atagaaaaaa 
ttgtgtttta 
aagttacagt 
taagtgtaca 
taccactcac 
aagtgcccta 
tctgtatttg 
ttatagtaac 
aggggtgtag 
atggcaagca 
acagagtttc 
cttctgcctc 
aggcaggcac 
tttggccagg 
agtgctggga 
cattttcctt 
attaaagtaa 
atcaaaaaat 
agcagtatca 
ggctgaagct 
gcccctgcac 
attatgccgc 
attggaaccc 
aaagcgttgg 
tgggtggatc 
tctctactaa 
ttaggaggct 
gatcgcacca 
ttataataaa 



acgatctcag 
tcctgagtag 
gtagagacag 
ccacccacca 
gcttgttttt 
ctgtcaggac 
ttaatgcttt 
ccccatcaag 
catcagtaag 
gttttgtttt 
ttattttatt 
cctttcctat 
tatgtgattg 
agtagagttc 
gacccaacta 
tagctctata 
ggaggtgctg 
taccataggt 
ttagaaattg 
ccattacgtc 
ctacaagaag 
gcacagtgga 
ctggtatgta 
ggtcctgtac 
aataaatttt 
gaacaaaaca 
actatacttt 
gtaaaacagc 
gagatgacgg 
aggtgaaaga 
tttgtcttag 
gcttataaaa 
agctgttatg 
aagctaattt 
gtgtaagtct 
tcgctgactc 
tacagatgta 
tgtttaaata 
atgtgataca 
taggccatac 
gcctaacgga 
actccattgt 
ccaggttcaa 
caccatacct 
ctgttctcga 
ttacaggcat 
atctgtaaag 
ggtaatgtat 
gttaactact 
aaaattagct 
ggaggattgc 
tccagccttg 
aacgttagct 
tagttttggg 
ctaggcacgg 
acctgaggtc 
aaatacaaaa 
gaggcaggaa 
ttgcactcca 
taaataaaag 



ctcactgcaa 
ctgggattac 
ggtttcacca 
ctgcctccca 
gtatcatata 
atagaaacca 
ttttggtatt 
gtagaaatct 
tcctgttcta 
gaatttgtac 
ctgtgcatac 
aaattccttt 
ttagtttctt 
accgtcaaga 
tctctgggga 
gttctaggcc 
aaagttgctt 
acttctccct 
aatccaatat 
aggcctggac 
ttttattggt 
aattgaagca 
cagtcacgtg 
gattataatg 
aataagtaaa 
aaaaaaatta 
taatcattat 
ttcagacagg 
ctccatgcgt 
aagtgttatt 
tttttaacaa 
taaggatata 
acaacagagt 
attattaaag 
acagtagtgt 
acccagagca 
ccatttttta 
cacaaattct 
ggtttgtagc 
catctaggtt 
aattctgttt 
ccaggctgga 
ccaattatcc 
ggctaatttt 
actcctgacc 
gagctaccat 
tgacgataat 
ataaaataca 
tttattacta 
gggcgtagtg 
atgagcctgg 
gtgacagagc 
tagaaatgat 
tattatgatc 
cgactcacat 
aggagtttga 
ttagccgggc 
aatctcttga 
gcctgggcaa 
taaagtattg 
atgtttgtga 
gccagtgagg 
acgctattat 
tagaagtcaa 
tgatgataat 
atctgtatca 
aatgcaaaaa 
cactgtcaag 
actttgagta 
ctgctatcct 
gcctgtaatc 
aagaccagct 
ggtgtggtgg 
tgaacctggg 
ggacaaaagc 
tacgaacata 
ccagttacaa 
tcagtataat 
tttaagaatt 
attgaaattg 
taactcagca 
aatatatgca 
atcctatgga 
acaaatattt 
ttttttttta 
tcactgcaac 
ctgggattac 
gggtttcatc 
tcggcctcct 
taattttaat 
catttgttat 
taaaacagtt 
gtctttttgc 
cattcggttc 
gttggccctc 
actgattgcg 
acagatccta 
tggcctgtgc 
aggaattcga 
attagccggg 
gaatcgcttg 
agcctgggtg 
tagcctactt 
aaagtatcat 
aataaatcaa 
tatttaggat 
ttcttgagac 
ccacagcctc 
attttttttt 
ggcaggcaga 
tctctactaa 
cttgggaggc 
agatcacgcc 
tttttttaaa 
aatgccgcct 
gttggccagg 
agtgctggga 
tgatttcgtt 
tggagtgctt 
atcctaagac 



atgatttatt 
ttatgttgct 
ataataccat 
atcattaaat 
attttcacga 
tataaagaat 
tagttcttac 
accctggtag 
tttctgtgct 
ggagcttagt 
ctagcacttt 
tggccaacat 
cggacacctg 
agacagaggt 
gaaaatacgt 
aacatttaca 
acttttcctt 
taataattat 
atttgaaaaa 
ataatgttct 
atcacacgcc 
caaagacttt 
ctattttctg 
tgggggagaa 
gacagtcttg 
ctccatctcc 
aggcgctcac 
atgttggcca 
agagtgctga 
ttgtaaactg 
aggtagttaa 
ttagttggat 
ctggcttttt 
gaggagatga 
ctgatgagtc 
tctgccatta 
ttatttgtaa 
ctgtaatccc 
gaccagcctg 
catggtggca 
aacccaggag 
acaaagtgag 
actatcttct 
gctgtttcat 
gtaatatgga 
actttttgta 
agggtctcct 
ctgaatagct 
ggccaggcat 
tcacgaggtc 
aaatacaaaa 
tgaggcagga 
actgcactcc 
tgatggagtc 
gcttcagcct 
gtagtctcaa 
ttacaggcgt 
ttgcattacc 
tcatatgtta 
aagaaatcta 



cttctaatga 
tgtatgtgtc 
acataaaaac 
agctagtagt 
ttaaaattaa 
gtaaattttc 
tagatgtgtg 
ttaggtagga 
agatggtagt 
ctacaaaaaa 
ggaagatcga 
ggcgaaaccc 
taatcccagc 
tccagtgagt 
ctcaaaaaaa 
aattatactg 
cgtagaatta 
taaatgtaaa 
aaaacaatgt 
tttgaagagt 
tggtgagtta 
aacatttatc 
ctaaaaagta 
aacccaacaa 
ctccagcgtc 
caggttcaag 
caccatgcct 
ggctggtctt 
gattacaggt 
tacaaaggga 
catttgtaac 
ttgatttcaa 
gtccagcaat 
atttctgggc 
tcacccaggg 
gggagaaaag 
attttaagtt 
agcactttgg 
gccgacatgg 
ggcacctgta 
gcagaggttg 
actgtgtctc 
aatcaaagca 
ttaggccatt 
atatattcat 
aaataagtga 
cgctgcaacc 
gggactagag 
gatggttcac 
gggagatgga 
attagctggt 
gaatggcttc 
tgcatggtga 
ttgctgtgtt 
aagtttcttt 
actcctggct 
gaaccactac 
gtgccacatt 
aaccatacct 
aggaggcata 
actagaggag
atgtgtatcc 
tgaattttag 
aaacagaata 
accttttctg 
agggtaataa 
tatgtaagga 
aaaaagacat 
gttacagtgg 
ggtacatatt 
ggcgggtgga 
cgtctctact 
tactcgggag 
cgagatcatg 
caaaaacaaa 
aataagttct 
gaaatataaa 
taaaaacatc 
ggaaacagat 
aaagtgacca 
tcttaaggaa 
ataaaccaga 
ttaatatcaa 
aattacatgc 
caggctggag 
caattctcct 
agctaatttt 
gaactcctgg 
gtaagccact 
taatacttgt 
cagtagaatt 
ctttaaaata 
ctttattata 
gggaacgtgt 
agttctgaca 
catacacatc 
gtggaaaaaa 
gaggctgcgg 
tgaaacccca 
atcctagcta 
caatgaacca 
aaaaaaaaaa 
tttgtggtaa 
attctatttg 
agcctctgaa 
atgaattctt 
tggaaattct 
gcatgcacca 
gcctgtaatc 
gaccagcctg 
tatggtggct 
aaccagggag 
cagagtgaga 
gctcaggctg 
tttttttgta 
tcaagcagtc 
ctataatgtt 
gtgcatttcc 
gattctcctc 
aagaagttaa 



atttttccag 
aggtgaaaaa 
gaatactgaa 
gagtgtcagc 
attttaaagg 
aattaaaatg 
acttagacta 
gaatgattca 
taaacaaaat 
ggccgggcac 
tcacctgagg 
aaaaatacaa 
gctgaggcag 
ccactgcatt 
caacaaaggc 
catgtttatt 
taataaacat 
tatgtacaat 
attttgatat 
tatatattaa 
atcagtttga 
aaaatcgagt 
ctttatgtaa 
attgtaattt 
tgcagtggtg 
gcctcaggcc 
tatagttttt 
tctcaagtga 
gcacccagcc 
agtacaacaa 
ataggtaaaa 
atgcttttca 
aatatttgaa 
cgctgactgc 
gctctgcgtc 
ctttccttca 
aagataaaag 
tgggcggatc 
tctctactaa 
cttgggaggc 
aaatcacgcc 
aaaaaagaga 
cttaaaatat 
aatctgtggc 
gagctcttta 
aggtctcctt 
gggctcaaat 
ccacgcctgg 
ccagcacttt 
gccaacgtgg 
catgcctgta 
tcggaggttg 
ctccatctca 
gtcttgaacc 
aagagacagg 
ctcccacctt 
gtgtttcact 
ttgacctttt 
aaaatcacac 
ctggttttat 
acttaattaa
attttaagaa
ggtggctcac
atttgcttgt
attgctagtg
agttaaaatt 
tactttcgtg
tttttttttt 
caatctcggc 
tcccgagtag 
agtagagatg 
tccgtctgcc 
ttatgcatta 
gaagtaaaaa 
tttatttatt 
tctctatcag 
tgatctcatc 
tcctggctct 
tcaggtattg 
catcccagta 
ccaggcacag 
acacgaggtc 
aaatacaaaa 
tgaggcagga 
actgcactcc 
gaaataaaat 
actgtattgt 
tgtttctctt 
tgtaagtatt 
tttttttctt 
aatccaccca 
ctaatttgaa 
gggagaccga 
tgaaaccccg 
atcccagcta 
cagtgagccg 
aaaaaaattt 
cctgacctca 
gtcttgctat 
ggcctctcaa 
caaggccttt 
ttgggttttt 
cagtaaatga

agcataataa 

ggccttggat 

tctcctgagt 

tgccattttg 

gtttttgttg 

atggcacgat 

cagcttcctg 

ttttagtaga 

gatccacctg 

ctgaatgtct 

atttttatac 

caccttgaag 

cagcattatc 

caccagggta 

attttataaa 

acatttacat 

acatcgattt 

atcccccact 

tagatttgct 

tgccaactgt 

gcatgtatca 

attttatttg 

gtgtacaagt 

ttgctctgtc 

tcctgggttc 

accaccacgc 

tctcgaactc 

aggggtgagc 

tagaattgct 

cgaaaaggct 

catttcattg 

gttctacttc 

atttgcttat 

ttttattttt 

ggctggagta 

attctcctgc 

cgggctgatt 

caaactcctg 

cgtgagccat 

tgctcttgtt 

tcctgggttc 

accaccacac 

aggctggtct 

gattaccggc 

tcaaactcct 

gcgtgagcca 

gagtttcagg 

caggtatttt 

tgtttgtttg 

gcagtggcac 

cttcagcctc 

ctgtttgtag 

gtgattctcc 

ctgtcttttc 

ggcacccacc 

atgttcaccc 

aaagtgctgg 

gtattgtctt 

tcagcagaat 



tagagccaga 

ttttctaatt 

aaattttcct 

atctttttct 

gttttttgat 

ttgtcattgt 

cgtggctcac 

agtagctggg 

gaaggggttt 

cctcggcctc 

tgtttttgat 

cttttgttga 

aatgttcaca 

agcccctcta 

gctactatcc 

taaaaccata 

catacaattc 

tagaacattt 

ccaccagccc 

tattctggac 

ctttcactta 

gaattttatt 

aattgtaccc 

ttttgtgtgg 

gcccaggctg 

aagcagttct 

ccagctaatt 

ttgacctcaa 

cactatgccc 

gagtcaagag 

gcaccatttt 

gaacttatta 

agtgtggttt 

tggcctttgt 

atttttattt 

caatggtgtg 

ctcagcctcc 

tttgtatttt 

acctcaggtg 

tgggcccagc 

gcccaggctg 

aagcgatttt 

ccagctaact 

caactcctga 

atgagctacc 

gggttcaagt 

ccttgctcag 

agtcctttat 

cttcattctg 

tttgtttgtt 

aatcacagct 

ctgaatagct 

agacagatct 

cacctctgcc 

actattaata 

accatgcctg 

ggctggtctt 

gattacaggc 

ctaatttgtg 

tttgtagttt 



aatattcccc 
actgttgaca 
aatttgtaag 
tctgttaagt 
ataggttaga 
ttgagacagc 
tgcaacctcc 
attacaggca 
caccatgttg 
ccaaagtgct 
taggcactta 
tactatatat 
agcagaacta 
gaagccctct 
tgacttttga 
ctgtgtattc 
agtggttttt 
ttttcactcc
taggcagcca 
atttcataaa 
gcatcatgtg 
cctcattatg 
tcctttctgc 
atacaggttt 
gagtgcagtg 
cctgcctcag 
ttttagtaga 
gtgatccacc 
ggctgtggtt 
gtaactctta 
gcaatcccac 
tctgtttggc 
ttgcacttct 
tctagctttg 
atttattttt 
gtctcagctc 
cgagtagctg 
tactagtgac 
atctgcctgc 
ctagattttc 
gagtgcaatg 
cctgcctcag 
tttgtatttt 
cctcaggtga 
aggcccagcc 
gatcctcctg 
cccctttgcc 
atattctaga 
tgagttgtct 
tttttaagat 
cactgccacc 
agggccatag 
tactgtgttg 
tcccagagtg 
gtgtcttcct 
gctaattttt 
gaactcctga 
gtgagccact 
aacatggatg 
tcagagtaga 



ttctagtgtt 
aataaataac 
agagtattat 
ttacctagga 
atgtcttggt 
atcttgctct 
acctcccggg 
tgtgcaacca 
gtcaggctgg 
gggattgcag 
agaaaggcct 
atagaaaact 
acccatgtga 
tgggcccctt 
tggcatagat 
ttttcttgta 
atatggtcac 
agatagaaac 
ctagtctact 
catggaaccg 
ttcaaaagag 
gccaaatatc 
catttatcaa 
tctttttgtt 
gcacaatctc 
cctcccgagt 
gatggggttt 
catctcggcc 
ttcatttctt 
aacttattga 
cagcagtgta 
tgtttttaaa 
ctgatgagta 
gaaaaatgtt 
ttttgagacc 
actgcaacct 
ggattacatt 

agggtttcac 

ctaggcttcc 
ttttttcttt 
gcacaatctt 
cctccccagt 
ttttagagac 
tccacctgcc 
aattttctca 
ccttggcctc 
catttttaaa 
taaatgtccc 
ttcctctacc 
aaggtctcat 
tcaacttcct 
atacacacta 
cccaagttgg 
ctgggattac 
gcttcagcct 
ttgcattttt 
cctcaggtga 
gcacccggcc 
tatcttcatg 
agcctttcac 



cttcaccatc 
cctttgaatt 
cgtattgcca 
gataaactgc 
tttttttttt 
gtcgcccagg 
ttcaagcaat 
cacctggcta 
tattgaactg 
gcatgagcca 
aggtactaac 
gcacttatca 
cccagcatcc 
ccattcactg 
tagcattacc 
cagctttatt 
agagttaggt 
cccctttact 
ttttatctct 
tatattatgt 
catcatgtta 
ccattgcaag 
taatgctact 
tttaaatttg 
ggctcactgc 
atctgggact 
caccatgttg 
tcccaaagtg 
ttgttgtata 
aaaactgcca 
tgagttttac 
aatgatagtc 
atgatgttga 
tattcaaatc 
aagtctcact 
ccgcctcctg 
tcaggcacct 
catgttagcc 
caaagtgctg 
ttttttttga 
ggctcactgc 
agctgggatt 
agggtttcac 
ttggcctccc 
ttatattgcc 
ccaaagtgtg 
ttagattgcc 
ttatcaaatt 
ttttaaaaaa 
tctgctgccc 
gggccgaagt 
tcacacccag 
tctcaaactc 
aggtgtgagc 
cccgagtagc 
agtagagaca 
ttcacctgcc 
aaaatattgc 
tatttatgtg 
ctccttgggt 



agcttaatgt 
ttcaatactg 
tttacaaagc 
tgagtatggt 
tttttttttg 
ctggagtgca 
tctcctgcct 
atttttgtgt 
ctgacctcat 
ctgcacctgg 
cataaaatat 
taaccttaga 
agatcaaaaa 
tccttcttgt 
tgttcttgtc 
gtgctaattc 
aaccattacc 
taaactccaa 
atagagacaa 
ggtcttttgt 
tccatgtttg 
gatttatgac 
gtgaccattt 
aggtggagtc 
aacctctgtc 
ataggcacgc 
gccagtctgg 
ctgggattac 
tacataggag 
gattgttttc 
agcttctcca 
attccaataa 
gcatcttttc 
ctttggccat 
ctgtcagcca 
tgttcaagtg 
gccagcatgc 
aggctggtca 
ggattacagg 
gaaggagtct 
aacctctgcc 
acaggtgcct 
catgttggcc 
gaagtgctgg 
caggctggtc 
gggagtacag 
tttttatatt 
atattatttc 
ggtgggtttt 
aggctggagt 
gatcctctta 
cttttttttt 
taggctcaaa 
cacacgcaac 
tgggattaca 
gtgtttcacc 
atggcctccc 
cttcttaaca 
ttctttcatt 
catttattcc 
tatgttttaa
agattgtttg 
tcaactttga 
tcaatatata 
cagagcaatg 
agggaaagct 
cttttttcag 
ttaaatcatg 
tatggatttt 
caaccttgca 
tgctggattc 
atttacactc 
attctcttcc 
ccatactctc 
gattttggta 
ctggtccctc 
tgagagccgc 
aattatttta 
tattttaaac
tctctctgta 
cacaaaaata 
aggaguactt 
tgtcatatgg
tactacccaa 
tgtgagtaca 
ctatacattt
aggccagcgt 
gattctcatg 
ctaatttttt 
ctcttggcct 
ggtaaaagat 
gagactgagg 
gcgccattgc 
tttaatattg 
gttaagcatc 
aaaattttgt 
attttacaat 
tttgatgaag 
acatatgccg 
tcttttaaat 
actttcactt 
tcctgggttt 
gtttcttcat 
ggctgcataa 
tcattttagc 
ccacagatcc 
aggtgtgtaa 
gaaaaaggga 
cgcagtggct 
gtcaggagat 
aaaaattagc 
aggagaatgg 
ctccagcctg 
ttgggagact 
tgtatttgga 
cttttagaaa 
tgggaggccg 
tggtgaaacc 
gtagtcccag 
ttgcagtgag 



gttcttttcg 
atgagagagc 
taaatctgat 
agatcatgtc 
atgagtagaa 
ttcagtttca 
attcaggaat 
aaagggtgtt 
gggttttatt 
tacctgagat 
catttactgg 
tatcagaaat 
attccaaaga 
ttgattacta 
tctgagtaac 
tttctttttt 
tttccctacc 
tttacttaac 
aatgctgcag 
actgaatttt 
ccaattctgg 
gctgtactca 
cctcaccaag 
gatacatatt 
gcacatttgt 
tattttattt 
gcagtggtac 
tctcagcctc 
tatttttagt 
caagtgatcg 
gaatattgag 
tgggaggagt 
acttcaacct 
tgctttcctc 
ttaatagtga 
acaaatctgt 
ttctattgta 
cgataattgt 
ggtaagctta 
tttttaatta 
agagtgtgca 
taatcttggc 
ctgtaaaatg 
ctgatacttg 
ttgtaataaa 
aatcataaaa 
aggcttgatt 
tttcaacata 
cacacctgta 
tgagaccatc 
cgggcatggt 
cgtgagcccg 
cgcagtggag 
gaggagccta 
aagccagctt 
ataacggaca 
agacgggcgg 
ccgtctctac 
ctactccgga 
ccaagatcac 



attccattat 
atagaaatac 
tgttagctct 
atttatggat 
gtggcagaag 
tcatttaata 
ttccctatca 
gaatattgtc 
ctgttgatgt 
gaatctcact 
tattttgttg 
gaattgacca 
tagacataca 
ttgctttgta 
agtcctcata 
gtttaactgt 
ctcccacccc 
ctattactta 
tgaataatct 
tagaagtgga 
tttttcttgt 
gctggctagt 
aatcaaaaac 
tctggatgta 
tggaacttag 
tattttttag 
aatcttggct 
cagagtagct 
agaaactggg 
gcctgcctca 
ggctgcatgg 
cctggagccc 
aggaattata 
ttttagctat 
tgaggttgag 
cacattccaa 
gtccagtgtg 
ggatgcggca 
gctcatgcct 
aattttacat 
gtataatgtg 
tctgccattt 
agaataataa 
gaaaaagtat 
aagatagtga 
tcactttctc 
acactaccct 
tttaattact 
atcccagcac 
ctggctaaca 
ggcaggcacc 
ggaggcggag 
cgagactctt 
gaaagtactt 
tttcagctgt 
aggccgggca 
attacctgat 
taaaatacaa 
ggctgaggca 
accattgcac 



aaatagaatt 
aagtgatttt 
aatagttttc 
agagatagtt 
caaaaatctt 
tgatgttagg 
ttcctgattt 
atgttctttc 
gaaatattaa 
tggtcatggt 
aagattttgt 
taaatgtgag 
tccgtctgta 
ataagttttg 
gaattagttg 
gtatcttgga 
tgctatagag 
gtcggggaca 
tgtatataag 
atttctaggt 

ggaggtgggg 

cattttagaa 
attcctattt 
tgacagcttt 
gtcgttaaga 
tttttgatac 
cactgcgacc 
atggttacag 
tttcaccata 
gcctcccaaa 
tggctcatac 
aggagggtga 
ggcttcagtc 
agtatgaggt 
tgaaagttac 
gcccaggact 
aaaaaagcca 
agtctggatc 
agaattttta 
ttttttctaa 
gtggttaagt 
attggcagcc 
agtgaaaaga 
tcctttgagt 
ttcataggat 
ttccctaaag 
gatccgtacc 
ttcagtagaa 
tttgggaggc 
cgatgaaacc 
tgtagtccca 
cttgcagtga 
gtctcaaaaa 
gaaggaagta 
gtcagctttg 
cggtggctca 
ctcaggagtt 
aaagttagcc 
ggagaattgc 
tgcagcctgc 



gttttcttaa 
tacatgttga 
ttgtggattc 
ttttttctgg 
tgtcttgttt 
tgtgggtttt 
tttaaggctt 
tgtatcagta 
ttgattttca 
gtataatctt 
atctgaacgc 
agtgtatttg 
tgtctgtctt 
aaatcagaaa 
ggaaatattc 
gattgttcct 
aggtctataa 
ttaagcttgt 
tcattttcca 
caacctatgg 
agtaggaggt 
aggtttcctt 
accctgtaaa 
tcatattgaa 
atgtcttata 
agagtcttcc 
tccatctcct 
gcatgcacca 
ttgaccatgc 
gtgctgggat 
ctgtaatccc 
ggctgcagtg 
actgtgcccg 
tacatttcag 
ttctatttca 
gattgtttca 
gtattaaaat 
cagaatcttt 
caagtgtaaa 
tctattatta 
ataaaggctc 
gctaacctct 
tgccaacatc 
ttaagaatta 
atgccactta 
atagcttgat 
ccagttccca 
agtaacagtg 
cgaggtgggc 
ccgtctctac 
gctacttggg 
gcttagattg 
aaaagaaagt 
aaaggtttgt 
tgtagtgatt 
cgcctgtaat 
cgagaccagc 
gggcgtggtg 
ttgaacccgg 
gcgacagagt 



tttcattttc 
tcttgcaact 
tttaggattt 
ctagaactta 
cctatctgac 
caataaatgc 
tttttttttt 
taaatgatcc 
gatgttaaac 
ttcaatatgc 
ttaagataac 
tgggttcttg 
tatgccagta 
gtataaatga 
cctctttatt 
tctcaacaca 
gtgtctgttc 
ttatgtcttt 
tcaatataag 
ctctgtattt 
agaatgctgg 
agcttctttt 
catggggctt 
gaaataatgc 
aattcataca 
tctgtcgccc 
gggctcaagt 
ccatgcccgg 
tggcctcgaa 
ccttgtattg 
agcactttct 
agttgtgatc 
gcatgtacat 
agtcattgtt 
aacactgaag 
tatacttcta 
actgaaaaat 
atatcaacgg 
taactttgca 
tatgcccaga 
tggagtgact 
tggtatctca 
atttactctg 
agttggttat 
ctgaaattta 
taacatgtaa 
gcagcaccat 
gtaggccagg 
ggatcacgag 
taaaaataca 
aggctgagac 
tgccactgca 
aacagtggta 
ttgaccacat 
tttagttctt 
cccaccactt 
ctgggcaaca 
gcgtgtgcct 
gaggcggagg 
aagactctgt 
ctcaaaaaat
aaagatgatg 
aattgcgtga 
tgttttgttt 
gtgcaatctc 
cctcccaagt 
ttagtagaga 
tccacccgcc 
tgtatttttt 
gaaagattat 
ctcggcaggc 
ctactaaaaa 
atctttctga 
tctatgaaat 
ctgatatatt 
gacgtcaagt 
ccgcacctgg 
gttcatgtac 
gcacttttgt 
ttcattcaca 
actacactgt 
tcatagtgta 
agtttttccc 
aagaggcaga 
tcactatgtt 
ctcccaaagt 
gatagctctt 
taataagaat 
tttgggaggc 
ccctgtctct 
ctacccgaga 
cttagatcac 
aaaagaatac 
tttcttttgg 
tccttttcac 
tttttcagca 
ctaatcattg 
tctttttttt 
aatggtatga 
caattctcct 
gctaattttt 
actactggcc 
gagccacagt 
gtataaacag 
agagcaaggc 
ttttttctta 
aactgaggat 
atgttttatt 
tggctttcta 
gattgttaga 
tcagtgactt 
aaggtttgtg 
aagtcaaaga 
accttgtgtt 
tctcctctga 
aaaaagcatt 
gtttgagacc 
cagttcttgc 
tctgcttcct 
gtgcgcacca 



aataataaaa 
ttattcttaa 
tttttcactt 
tggttttttg 
agctctctgc 
agctgggact 
tggggtttca 
ttggcttccc 
aaatgtataa 
cttcaccagg 
ggctcacttg 
taaataaata 
aatttttctc 
aacttattag 
gatgttcaca 
gatccacctt 
cattattctt 
cattgcttat 
ttgcttattt 
gtgagttttg 
agtattccgt 
tgtattactg 
tcttataccc 
ttaaaaaatt 
gcccaggctg 
gctggggtta 
acaatttact 
acttcatact 
cgaggcgggc 
actaaaaata 
ggctgagaca 
accactgcac 
ttcagactta 
tatgtctctt 
attatttctg 
tgacacttca 
gttggtaata 
tttttttttt 
tctcggctct 
gccttagcct 
gtatttttag 
tcaagtgatc 
gcctagccac 
atgtatgtat 
ctaccttttg 
tttgtgaatt 
tttgaatgga 
tcaaataatc 
gatttagatg 
agttaaagtc 
ggggcaattc 
gtgtttttat 
ccacttaagg 
tatagtcaga 
atcaaaagtt 
gtctattttc 
gtttatttta 
tctgtcaccc 
aggttcaagc 
ccacacctgg 



taaaaaagaa 
gggatggttc 
ctgtaattgt 
agacggagtc 
aacctctgtc 
acaggcacgt 
ccgtgttagc 
aaagtgttgc 
aatgaagcag 
cgcagtggct 
agttcgaaac 
aagatggttt 
aaggcaagta 
gaagatatct 
gaatttttct 
cctcagcctc 
ataaaaggtt 
tttcttccct 
ttatgtaatt 
gacattccta 
atgtaatatt 
ttttgccatt 
agtattacag 
tttttttgaa 
gtctcaaact 
caggcatgag 
ttgtaaagta 
tttggctgga 
agatcaagag 
caaaaattag 
ggagaatcac 
tccagcctag 
attttttttc 
gattgttcag 
ttgggtgatt 
ttattcaaaa 
ccttaaaaat 
tttgagatag 
cagctcactg 
cccaagtagc 
tagagacagg 
cgcctgcctc 
tttttgcttt 
acacacaact 
ggtgcttctt 
ttagaattgt 
attgcactca 
cctactttaa 
aaacgcttta 
ttctgttcat 
atccgagaat 
acttcatatt 
tttaaactgt 
agtaagtaca 
agtgaacttg 
ttatttaatg 
ttaaattata 
agaccggagt 
gattttcctg 
ctaatttttg 



tggacagtaa 
atttatttaa 
gtgtatgtat 
tcgctctgtt 
tcccaggttc 
gccaccacgc 
caggatggtc 
tattacaggc 
aaaagagaaa 
cacacttgta 
cagcctggcc 
taatatatgt 
aatttgtatc 
ctaaaataag 
tttaaccgac 
ccaaagtgct 
aaatttctag 
tcctactcac 
gatattacgc 
tgttcatcta 
tactataact 
ttggtatcaa 
aggatctctt 
aaaatttttg 
cctaggctca 
ccaccatgcc 
tctgcatcat 
cacagtggct 
atcgagacca 
ctgggcgtgg 
ttgaacccgg 
caacagagtg 
cagtcttaag 
gttttttctt 
ttattagtga 
aaaaaaaaag 
aagaccctta 
agtcttgctc 
caactgcaac 
tgggattaca 
gtttcaccat 
ggcatcccaa 
ttaactttgt 
atggctttat 
ttacaaaatt 
gaattacctg 
attaaagatt 
attacttagg 
aattgattgt 
tcttatttag 
ctgagcctga 
aagcctttac 
ttattttgta 
agggcttcct 
ctttgccact 
gcaaaatacc 
ttttttctct 
gcagtggtct 
cctcatcctc 
tatttttagc 



acctaaatga 
gaccttacat 
aatgtaaata 
gctcaggctg 
aagcgtttct 
ccggctaatt 
tcaatctcct 
atgagccacc 
tgataatttt 
atcccagcac 
gacatggtga 
tttagtttta 
agttggtata 
atcactttgc 
ttgataaatg 
gggattacac 
ttaagtttaa 
agtaatcatt 
tccattctgt 
tacagactta 
catcactgta 
tgagtattta 
tttatatgct 
aaaaaaaatg 
agcaatcctt 
tggcctacat 
tttatgttct 
cacgcctgta 
ccctggccaa 
tggcgcaccc 
gaggtggagg 
agactctgtc 
tgtttgctaa 
ttatgaattg 
cttgttaaaa 
attctctatg 
ctgtattttt 
tgttgcccag 
ctctacctcc 
ggcatccacc 
gttggccagg 
agtactggga 
tttatagtac 
aatatgtttc 
gtcttggcta 
ttgactcacc 
atcttgcttt 
atagctataa 
tttctcctaa 
gaagatgaca 
acctgatgta 
tcacattagt 
aagtaaccac 
gtagtcacat 
ccagaaggca 
cgacctaagt 
tttctttttt 
gaccgcacct 
ctgagtagct 
agagatgagg 



gttcattccc 

aaagtctatc 

tatatgtttt 

gaatgcagtg 

tctgcctcat 

ttttgtattt 

gacctcgtga 

acacccagca 

tcttcatctt 

tttgggaggc 

aactccgtct 

tgattttagc 

ttggtaccca 

ctaaaataaa 

cattattctt 

acatgagcca 

tgtcctcttt 

cttatggtat 

acgttgtact 

cttcatttta 

gcagagcatc 

agtcatttgc 

tctttgtacc 

aaatgaagtc 

ccatcttggc 

tttaaatttt 

caccagtctt 

atcccagcac 

tatggtgaaa 

gtagtcccag 

ttgcagtgaa 

tcaaaaaaaa 

tgagattgag 

actgttcatc 

ttctgtatat 

tttctcgata 

tgcttttttt 

gctggagtgc 

ctgtttcaag 

accacaccca 

ctggtctcaa 

ttacaggcat 

tatagtttta 

agtcattgtt 

ttcttgtgcc 

atgttttgta 

ctgtgcagca 

attgtgtttc 

atttaaaact 

tttggaagag 

aggaaatcaa 

gattgactgt 

tgtatctttc 

ctttatgcaa 

catgaatatg 

tggacttaat 

ttttttgaga 

cactgcaacc 

gggactacaa 

tttcaccacg 
ttggctaggc
gtgctgggat 
ttcctcaaga 
actggattaa 
tgagttcatg 
gtctcaagga 
gagaacactt 
gctgggggag 
gcacaccaac 
ctagaactta 
tcaaaattca 
aacagaaatc 
acacaggaag 
gtttggttga 
attcaatttt 
cctgtaatcc 
accatcctgg 
atgcctgtgc 
gggattacag 
tggccaggct 
caaagtgctg 
tttttttttt 
gtgcgatctc 
tcttccaagt 
tagtagagac 
tctgcccacc 
atcttatttt 
caggcttggc 
cccaagtagc 
agtagagatg 
cctcccaaag 
ggctgcatgg 
gttggagccc 
ccacctaggt 
atactgggaa 
ggcattggcc 
gggtctcctc 
tgtttttttt 
atcttggctc 
caagtagctg 
tagagatggg 
ctcgcctctg 
ttttttgttg 
gtcgcccggg 
ttcaagccat 
cgcccggcta 
gtctcgatct 
ggtgtgagcc 
gacagctcat 
agcataggtg 
agcagttgaa 
ttcagtctgt 
gtatgcatct 
tacattatat 
atgaaaagca 
gtcaggaaag 
taagcccagg 
cgaagctaga 
gagcaaagga 
ggacttgaaa 



tggtctcata 

tacaagtgtg 

ttgatgaaag 

gaaaatgtgg 

tcctttgtag 

tagaaaacca 

ggacacaggg 

gaatagcatt 

atggtacatg 

aagtataata 

aattgttcat 

ttttttgtca 

atttcagaaa 

attgacagta 

gataagtatg 

tagcactttg 

ccaacatggt 

acccgcctcc 

gcgccatggc 

ggtctggtct 

ggattacagg 

ttttgtttga 

agctcactgc 

agctgggatt 

agggtttcac 

tcggcctccc 

ctttcttttt 

tcactgcaac 

tgggattata 

gggttttgct 

tactgggctt 

tggttcatac 

aggagggtga 

aatggagcaa 

gaggtcagtg 

aggaagaact 

cccagtatta 

tcctgagatg 

actgcaagct 

ggactacagg 

gtttcaccat 

ccttgcaaag 

ttgttattta 

ctggagtgta 

tctcctgcct 

attttttgta 

cctgatgtcg 

accgtgcctg 

gaagaagtgc 

ttcggtgttt 

gatatatagg 

ttatgttatt 

attgttctgg 

tggcggagac 

gggtaggggg 

gcctcactga 

cagcatgtgg 

gagctcagca 

atgagcagta 

gtgtcaggga 



ctcctgacct 

agccaccatg 

taatgaaata 

cacatataca 

ggacatggat 

aacaccgcat 

tggggaacat 

aggagatata 

tatacatatg 

aatttaaaaa 

ttaatcagaa 

agtgcattct 

caaatgtgga 

ttttattgag 

tttaagatta 

ggaagctgga 

gaaaccctgt 

gggtttaagc 

taatttttgc 

caaactcctg 

catgattcac 

gacggagtct 

aacctctgcc 

acaggcgcgt 

catgttggcc 

aaagtgctga 

ttttgtcggg 

ctctgccccc 

ggcacctgcc 

atgttgacca 

acaggcgtga 

ctgtaatctg 

ggctgcggct 

gaccatgtct 

gtggttttag 

ctacagtgtc 

atagaaaatc 

gagtctctct 

ctgcctccca 

tgtccaccac 

gtcagccagg 

tgctggagtt 

tttatttatt 

gtggcacgat 

cagcctcctg 

tttttagaag 

tgatccgccc 

gcctgatttt 

tcctgcttca 

gcagtttctg 

aaattttttc 

ccttcattca 

gtcctgggga 

agtaacagac 

ctgggagaga 

ggaggtggca 

aggaagagtg 

tgatcaagga 

gaaggtgagt 

cacattggaa 



caagcaatcc 
cctggcctta 
taaaagtaat 
ccatggatac 
gaagctggaa 
gctctcactc 
cacacgctgg 
cctaatataa 
taacaaagct 
aaataaatat 
gagtagttta 
ttgtgactga 
tccgtgacag 
taaaagatac 
agagctattg 
gcaggtgggt 
ctctactaaa 
gatcctactg 
atttttagta 
acctcaggtg 
catgtctggc 
tgctgtgtcg 
tcctgggttc 
gccaccacat 
aggctggtct 
gattacaagt 
tgggaggggg 
caggttctag 
accacgcctg 
tgctggcctc 
gcttgtattg 
agcactttgt 
gcagtgaatt 
ctaaaaaaca 
aacagaggaa 
tttaggtagc 
tctgagctgt 
ctgtcggcca 
ggttcacacc 
cacgcccagc 
atggtctcga 
acaggcgtga 
tatttatttt 
gtcggctcac 
agtagcaggg 
agacggggtt 
acctcggcct 
tttttttttt 
tatgtatatg 
tttgttttat 
ccaaaccact 
ttcattttat 
agaaaacaaa 
aaacaaatgt 
gtagtaggga 
ttttgagtag 
ttcttggtga 
acagcaagcc 
gagttgggag 
gttggagcag 



atccgccttg 

ttaaattatt 

gaaatatatg 

tatgcagcca 

accatcattc 

ataggtggga 

ggcctgtcgt 

atgacgagtt 

gcacgttgtg 

atgtggaaaa 

gtcaaatcca 

tttcattttc 

atggtatcta 

taatttttgt 

gccaggcgct 

cacgaggtca 

ttagccaggc 

cctcaggctc 

gagacagggt 

atctgcccgc 

catttatctt 

cccagagctg 

aagcaattct 

ctagctaatt 

cggaactcct 

gtgagccact 

acagagtcta 

caattattct 

gctaattttt 

aagtgatccg 

ggtaaaagaa 

gagactgaga 

gtgatcacgc 

aaacacaatt 

gtgccagatg 

ttctgtccat 

ttttttttgt 

ggctggagtg 

attctcctgc 

taattttttg 

tctcctgacc 

gccaccgtgc 

ttgagacaga 

tgcaagctct 

accacaggcg 

tcaccgcatt 

cccaaagtgc 

taatctggtc 

tgttagcata 

atgaattaag 

atctctgctc 

agaacagtgg 

gttcctgctt 

agcctgtgta 

gtgctatttt 

acctgagcgc 

aaggaacaag 

ccgtgtggct 

gtcaccagag 

ggaaatgatg 



gcttcccaaa 

tttattaaat 

tggaaaatag 

taaaaaagga 

tgagcaaact 

attgaacaat 

ggggtggggg 

aatgggtgca 

cacatgtacc 

tattaatagg 

agggttagac 

ttcctggttt 

gaagttttta 

aagaagaaaa 

gtggctcatg 

agagattgag 

gtggtggcac

ctgagtagct 

ttcactacat 

cttagcctcc 

attttctttt 

gagtgcaatg 

cctgcctcag 

tttgtatttt 

gacctcgtaa 

gtgcccagcc 

gctctgtcgc 

gcctcagcct 

tgttattttt 

cccaccttgg 

caatattggg 

tggaaggagt 

cattgcactt 

tttttaagga 

acctttgtga 

aaggataatg 

ttgtttgttt 

ctgtggcgcg 

ctcagcctcc 

ttatttttag 

tcgtgatccg 

ctggcctggt 

ctctcgctct 

gcctgccagg 

ctcgccacca 

agccaggatg 

tgggattaca 

tcatacctct 

gtgttaacat 

gtgtattatg 

gttctattca 

agtgcctact 

tcatggaact 

catgtgttac 

cgaggtggtt 

agcgggggcg 

gatagaggcc 

ggaatggagt 

accatggcaa 

ggatttatgt 



141960 
142020 
142080 
142140 
142200 
142260 
142320 
142380 
142440 
142500 
142560 
142620 
142680 
142740 
142800 
142860 
142920 
142980 
143040 
143100 
143160 
143220 
143280 
143340 
143400 
143460 
143520 
143580 
143640 
143700 
143760 
143820 
143880 
143940 
144000 
144060 
144120 
144180 
144240 
144300 
144360 
144420 
144480 
144540 
144600 
144660 
144720 
144780 
144840 
144900 
144960 
145020 
145080 
145140 
145200 
145260 
145320 
145380 
145440 
145500 






WO 01/27857 



PCT/US00/28413






ttttatgttt 
gcttcaagaa 
tgatggtggt 
ggtagaggca 
ttaatttaat 
aacacccctg 
ttttttggtg 
ttactatgtg 
gtctgtttta 
accctacaga 
ttccttgaca 
cattttagat 
ttggtctgta 
ttagagatga 
gaactttctt 
ggctttctgt 
tgctggcctc 
tatcttctat 
tggctgtggc 
tattatctct 
cactcactac 
gagatagggt 
ttgggctcaa 
caccatactt 
caagctcctg 
gtgagccact 
cttacatact
caatactttg 
ggcaacaaag 
acatgcctgt 
Ugttgaggcg 
accatgtgtc 
atgtgggcag 
rctagcacat 
rttcatttgg 
rtttttaaac 
:tacttccga 
rttagcactt 
itctacaaat 
-Lgggaggcc 
jgtaaagcct 
:ataatccca 
ratcctggct 
-ggtggcacg 
-cgggaggcg 

jagggagact 

lcgagtacct 
jaggtggagt 
jagactctgt 
?tctgtgatt 
tataccacct 
:tccttattg 
jagttttgaa 
ragtcttgct 
rtgcctccca 
jtgtcaccac 
Itcttgctct 
Itctcctggg 
:gtgccacca 
iccaggctgg 



agtgttttta 
gagaagcaga 
gtggattagg 
aaaagattat 
tgagacatgc 
tatatcctgg 
gtgtaggagc 
ccaggcacta 
ctccagcttg 
atgtttaatc 
tttcttgtaa 
tagaacacat 
cattaaaatg 
gaagaaagag 
gccctatgca 
tgcatttgga 
tatcacctta 
tcttttgctg 
aagccctgcc 
tcagagaggg 
cacttctttc 
cttgctctgt 
atgatcctct 
ggcttattat 
ccgcaagcaa 
actcctggcc 
tgtttatagc 
ggaggctggg 
tgagaccctg 
ggtcccagct 
acggtgagcc 
taaaaagtaa 
ggagtttgtc 
ggtaagtact 
gattatctga 
ttggttgccc 
ggatactcac 
tcaagctaat 
gtagaaattg 
aaggcgagcg 
tgcctctatt 
gcacgttggg 
aacacagtga 
cgcttgcagt 
gaggttgcaa 
ctgtctcaaa 
gtaatcccag 
ttgcagcggg 
cttaaaaaaa 
tgttaggaat 
tgacaatggt 
agggcagctg 
accttgataa 
ctatggccca 
ggttcaagca 
gcccagctaa 
gtcacccagg 
ttcaagcaat 
tccctagttc 
tctcgaactc 



agggattgct 
gaaacaacat 
ctggtagtgg 
atxtctacca 
ccacataaac 
ttcttctttt 
ctagagattg 
tgctgaatgc 
gttccttttt 
ttctgtactt 
actggaagtt 
catgtgttgt 
ttgcctgaat 
gtgcctttta 
tacgttattg 
atgaaatcta 
ctttgaacca 
aagtttcttc 
atggctttca 
accttcccaa 
ttttcttttc 
tgcccaggct 
cacctcagcc 
tttacttttt 
tccacatctc 
tattttctta 
ttatttctca 
ttggagaatt 
tctataaaaa 
acttgggagg 
atgattgtgc 
ataaaaatag 
tatactattt 
ccttaaatat 
gtggtaagat 
tttgccacac 
agaggccatt 
gcaattctta 
aagtctgggc 
gatcactgag 
aaaaatacaa 
aggccaaggc 
aaccccatct 
cccagctatc 
tgagctgaga 
aaaaaaaaaa 
ctactaggga 
ctgataatgc 
aaaaaaagaa 
cacacagcag 
ttgtttacag 
gaaagaattt 
tagagcacag 
ggctggagtg 
attctgcctc 
ttttctgttt 
ctggagtgca 
tcttctgcct 
atttttgtat 
ctgatctcag 



ctatcagcta 
tcttgccata 
aagaccagtc 
gcaagcccat 
taataaatag 
agttgtccag 
aatttattca 
caaggatgta 
aatgaccctg 
tcctggttgt 
acacctatag 
atatggtgtt 
ggatacacat 
cttttcaata 
cttaatcatc 
gcctctttgc 
ctcctttcat 
actttgagtg 
tgcaaggatg 
ctccgatgat 
cttttatctt 
ggaatcacga 
tctcgagtag 
gtagagacag 
tcagcctccc 
ttcactgtct 
gctggacatg 
ggttgagccc 
attgtttaaa 
cagaggtggg 
cactgcactc 
tttctctttc 
ggcactatat 
ttattgactg 
tacggattat 
tgacatagac 
ctcttctcaa 
gatgatgtat 
acagtggctc 
gacaagagtt 
caattagggc 
aggcagatca 
ctactaaaaa 

gggaggctga 

ttgcaccgct 
aaaaacaatt 
ggctgaggga 
accactacat 
agaaagaaat 
gttagtagca 
ttcggctccc 
tcatcattta 
aggaaaagac 
cagtgacacc 
agcctctcga 
ttgtttcgtt 

gtggtgcgat 

cagcctcccc 
gtttagtaga 
gtgatctact 



tctggaaaat 
gtcatagtct 
cagttcgggt 
ctatgaagtt 
gaatttctgc 
atgtctcttt 
cccaaaaggc 
aataagaggg 
acttgttaag 
gttatttagc 
tcttgatgat 
tttgaaagcc 
aaaatttaac 
taccttttcc 
cacctcatct 
tgttacctgt 
ggactgagct 
cctctgcagt 
gttcctcctt 
ctaaaatcct 
tttttttttt 
ctcactgcag 
ctggaactgc 
ggtttcacca 
aaagtattgg 
aaaattatct 
gtgcctcaca 
aggacttcaa 
aattagctgg 
agaatcgctt 
tagcctagtg 
atgactagaa 
ttcctgattc 
aattatttaa 
atttatgtaa 
actaagtttt 
tccccaaata 
ctgtgtatat 
tcacctgtaa 
aagaccagcc 
cgggcgtggt 
cgaggtcagg 
tacaaaaaat 
ggcaggtgaa 
gaactccagc 
agccaggcgt 
ggagaatcac 
tccagcctgg 
tgaggaatgt 
actacagggc 
cttcctctgc 
ctagcctata 
tgagttttct 
atctcagctg 
gtagctgaga 
ttgttttttt 
gttggctcac 
agtagctggg 
gatggggttt 
cgtctcagtt 



145560 

145620

145680 

145740 

145800 

145860 

145920 

145980 

146040 

146100 

146160 

146220 

146280 

146340 

146400 

146460 

146520 

146580 

146640 

146700 

146760 

146820 

146880 

146940 

147000 

147060 

147120 

147180 

147240 

147300 

147360 

147420 

147480 

147540 

147600 

147660 

147720 

147780 

147840 

147900 

147960 

148020 

148080 

148140 

148200 

148260 

148320 

148380 

148440 

148500 

148560 

148620 

148680 

148740 

148800 

148860 

148920 

148980 

149040 

149100 






tcccaaagtg 
ttcaccatgt 
cctcccaaag 
aaaagagcct 
atcatctaag 
aatgtagtgg 
tgataaatcc 
agagatttct 
ttgttgccga 
ggttcaagca 
cacgcctggc 
tcgaactcct 
gcgtgagcca 
cctgtcagta 
agccagtgga 
ggtagcaaga 
gaaatttggt 
ggttttctaa 
ggggtgattc 
actctggttg 
attgagggaa 
gaccttgcct 
tgagcatgtt 
ggctggccta 
gggagttaag 
cagctagttt 
ttttatatag 
atgtattgct 
tttgtttttt 
agagtagaag 
gtttgcaagt 
tccttagtgt 
taaataagag 
agtctcgctc 
tgctcccagg 
cctgctacca 
ccaggctggt 
tggaattaca 
atgcatgaaa 
atgaagaaat 
tttctgttta 
tcccatgttt 
gtggtccctt 
gaggctagga 
gtaatcagtg 
agtagtggag 
aaaggaaaat 
tagttgacac 
gttttttcgt 
catttgttaa 
cagtttaaat 
agtcagtgac 
aaggatgact 
agaacataag 
cataagcaac 
gggcaataca 
gttaaatatc 
gaggccaagg 
tgaaaccctg 
ctgaggcagg 



ctgggattat 
tggttagact 
tgctgggatt 
tctgagatta 
agacagtgta 
caatcccttg 
cttacatgtc 
tttttttttt 
ggctggagtg 
attctcctgc 
taattttgta 
gacctcaggt 
ccgcgcccag 
gaatgagaat 
attagctggt 
attagaggga 
ggcagatttc 
catgtcaata 
aacagccagt 
gcagtaaaat 
aaggatccag 
caagaaataa 
caaattatta 
cttcggatga 
gcgtactctg 
atatagacta 
gagctggtct 
agattcttac 
aatgtttgcc 
aaaagtccag 
taaatgctct 
tttggggtat 
cattgtggtg 
tgttgcccag 
ttcaagcaat 
tgcctggctg 
cttgaactct 
ggcatgaacc 
tgtacattca 
gggtgcaagg 
taagtgccac 
aagcagcatg 
cttaacatct 
ccactgaagg 
tgtagatcaa 
gcttgctttt 
gttttatttt 
tggaagacag 
acatgggaat 
gagcactgag 
tattttcttt 
attatgcagc 
tcgttttgtg 
gggtttcttt 
tagaaaattg 
gggaaaccat 
aaagttcagg 
cgggtgaatc 
tcttagccgg 
agaatcgctt 



tggcacacgc 
ggtctcaaac 
acaggcgtga 
tgagaagggc 
acaagaagga 
tgcttcgata 
attttaagga 
tttttttttt 
caatggcgtg 
ctcagcctcc 
tttttagtag 
gatcctcccg 
cccagagatt 
gaattggagg 
aatgttgata 
aggtcggatt 
atgtgtaaat 
gagtgactct 
tgagccttca 
ttcattaaac 
gttttgtatt 
tctaccaaca 
aataaaaaag 
ttctcagcat 
gcttggatag 
gagaactaga 
ggaaggtttg 
ccaagagcat 
acaaactaac 
aactctgaaa 
gttatgtaag 
cacacaaaaa 
gtggtggtga 
gttggagtgc 
tcttctgcct 
atttttatta 
taacctcagg 
accatggcca 
attttgtctt 
aaatactgat 
cctcatgtaa 
gcacatttat 
aacaattgcc 
atatacatgc 
aagctcaaat 
ttaatagtta 
gaccatctaa 
ggaatgacat 
gaaattctta 
tatgtgcatc 
aggcccagga 
aggctcagta 
taaactaaaa 
gcctttgaag 
acaaactaaa 
ggaaaccaaa 
ccaggtgcag 
acttgaggtc 
gtgtggtggc 
gaaccaggga 



ctatttttgt 
ttctgacctc 
gccaccgtgc 
aagcaagata 
attgtaaaat 
cattggtggg 
gcttagactg 
tttttttttt 
atctcggctc 
cgagtagctg 
agacggggtt 
cctcagccac 
tctaaacaga 
tgggagagac 
ggagaagaaa 
tatgatatgt 
tgggaaggta 
gcaggggggc 
tgcagagcat 
caatatttaa 
ttttatgaat 
attaacttgt 
taagctgtgt 
gtgattacag 
agtagagctc 
atgtagcagc 
aaaacataac 
tatcctggtt 
actagatgtt 
caccttttca 
caatataatc 
agaatatcca 
tagtggtttt 
agtggcacga 
cagcctcctg 
ttttagtaga 
tgaatcaccc 
gccaaataag 
tgtttactag 
gaggtaaatc 
gtgaggttta 
gttctcttac 
tggtagtagc 
attcaagttt 
gtttccttcc 
attaagtaca 
tatgaaagta 
gttaatattc 
tccaaataag 
tcgatccatc 
agagctagct 
tgatcaaccg 
agtattattt 
gattaactgc 
tgaaacaact 
cagagcccag 
tggctcacgc 
aggagttcaa 
aggcacctgt 
ggcggaggtt 



atttttagta 

aagtgatttg 

ccagccaaga 

acttaagaag 

gatgttatga 

agacaaaact 

actcccatca 

tttgtgacag 

accacaacct 

ggattacagc 

tctccatgtt 

ccaaagttct 

gttctaacca 

tggcatgagg 

aagattcaaa 

ccaaggttga 

gattgagttt 

ctgacgagag 

ttaacactgt 

acccttaggt 

tcagttattg 

tttaaagcaa 

atttcattca 

atgtgggctt 

tttgaaactc 

atactctgtc 

aaatgtgttg 

agggtttggt 

agttctttca 

aaagtttttc 

agtttttatt 

tatctggaag 

tttttttttt 

tctcagctcg 

agtagctggg 

gacaggtttc 

acctcggcct 

agcattttta 

gatccatgtt 

ctacctttag 

aaattttcct 

ccagaatgta 

agtgaaggta 

ccatcagcca 

ccactggcag 

ttgagagatg 

gttcggtgtt 

atagccagag 

tagaaattat 

taatgaataa 

tggaagattg 

ttagagaaat 

tccaggtgta 

tgtggggatt 

gtttgcatat 

tagtcttgct 

ctgtaatccc 

gaccagcctg 

aatcccaact 

gcagtgagcc 



gagacggggt 
cccgccccag 
ttgagttttg 
ttacattaaa 
gcacgtgccc 
gtacttaaat 
tgtagacatc 
agttttgctc 
ccacctccca 
catgcaccac 
gtggctggtc 
gaaattacag 
gatgcttttc 
gacaccagtc 
gttaggtagt 
attctaaggt 
ttttaacatg 
aacagtgcat 
gactctgtag 
aataataaaa 
aattaaacag 
agttaggaag 
tagaaataga 
atacatccta 
ttctctcacc 
ttagaagccc 
gtgtctccca 
ttggttttgt 
tcaagtgagg 
aagccatgat 
aatgtaacat 
caacagcttt 
tttgagttgg 
cttcaacctc 
attataggca 
accatgttgg 
cccaaagtgc 
atgtaaaatt 
ctcacaagct 
gataaaaaga 
tttctttagg 
ccaagaaagg 
tcttcagtca 
gcaggcatca 
ttttacttca 
ggaggtgaaa 
aggtatccag 
ggtggcccag
gtgcgtaagc 
ccattatcac 
ctaaaatgat 
ctacaaaggt 
aaaataaaaa 
accttcttat 
attggacaat 
gaacgaaaga 
agcactttgg 
gccaacatgg 
atttgggagg 
gagatcacac 
cactgcactc

gagagctcaa 

aaagaaagcc 

gtatatgtgt 

ctgaatgatt 

gtagagagac 

ctactgaaga 

ttaaagggtc 

taacacaaca 

gcgcagtggc 

gaggtcagga 

acaaaaactt 

ggcaggagaa 

gcactctggt 

cgtcaatcac 

tctatagaaa 

gtggctattt 

aatcccagca 

gtgaaacccc 

atagtcccag 

ttgaggttgc 

actctgtctc 

gcaaaatagt 

ggccgggcgt 

atcacctgag 

taaaaataca 

ggctgaggca 

accagtgcac 

aaaacagtag 

ttttgcaggt 

ccaaaacccc 

agtcctaggc 

caacctccaa 

acaccattac 

cttccgtgat 

gtgtgtcttt 

tctattgaaa 

tctggatgcc 

tgttcactca 

cttctggctg 

tttgtaataa 

tttcagttat 

tttttccctt 

gtcactgttg 

acctgtaggc 

tggtattgtt 

gagtgtttta 

gaactgatac 

ggtgccagag 

aggcagcttt 

tatgcagtct 

tttcagtgac 

tcctgagatg 

ggtttacttt 

tcccttcaga 

gtggcacctt 

gagcaggatg 

gtgagctgga 

ctcactcact 

gactgcccag 



cagcctgggc 
tttgagtaga 
gtagagatat 
ggggtgaaaa 
gctggaacat 
ttagtaatac 
taaattagtc 
aaattaccta 
aaattcagca 
tcatgcctgt 
gttcaagacc 
agccaggcat 
ttgcctgaac 
ctgggcaaaa 
aaattaccaa 
cagacccaga 
taaatattaa 
ctttgggagg 
ttctctacta 
ctatatggga 
agtaagccga 
aaaaaaaaaa 
tttaaaagaa 
ggtagctcac 
atcggtagtt 
aaattagctg 
ggagaattgc 
tccagcttgg 
actcgaagaa 
gactacttta 
aaaagaaaaa 
ctgcgaagga 
ctcgacattt 
tgaattgctt 
atttgtccgc 
accatccctc 
attggcaata 
agtccttgtt 
tattttaaca 
tcttaaactc 
taggtcactt 
gactcaaaac 
ggttggattc 
tttccaggag 
ataattgatg 
tacattacta 
gcctactgga 
ttgatgcaca 
gtggtttgct 
gagtggcagc 
tcttgacagc 
ttcctccttc 
cactatactt 
gagcattagg 
atacatgtgg 
gaacaaagca 
actggggaga 
aagtgtgcag 
gcacttctgc 
ttggtgagca 



gacgagcgaa 
agttgtagga 
ttagagagat 
cgcatgtgtc 
agggctaaga 
acaaggcatt 
ctagagtaca 
acaactgcat 
cttcacagtg 
aatcccagca 
agcctggcta 
ggtggccggc 
ccaggaggtg 
agagcaaaac 
gcatgacatg 
tgtgagaaag 
aaatatgttc 
ccaaggtggg 
aaaatacaaa 
ggctgaggca 
gattgtgcca 
aaaaaaaagt 
ccaaatggaa 
gtctataatc 
caaggccagc 
ggcgtggtgg 
ttgaacccgg 
gccacaagag 
ctagctgagt 
gttcctcatg 
ccccttctaa 
atactcattc
gcctatagga 
catgtgcaca 
agtgctgtga 
ttgaatatgc 
tttttcattc 
atatgcccct 
aggaaaatta 
tggtatatag 
gttagagaaa 
ttgagataaa 
ttcaacacag 
agaatgggag 
cacatgatgt 
gaaaattatt 
atagacaggg 
ctcgtagtgg 
ttgtccttcc 
gtggtgctag 
ggcattaatt 
ctctgatggc 
aaaaccattc 
atttgcccct 
aaagaaagaa 
gttcttccca 
gagaaacatt 
ttggtcgtct 
tcagttggct 
cactccattg 



accccatttc aaaaaaaaaa 
gaaaagagga 
ccttggccta 
accccccaga 
gccagaagga 
ttcacagagg 
ccaagtttca 
aggcctaacc 
atgtctgacg 
ccgaggcagg 
accccgtctc 
ccggctactt 
tgagccgaga 
aaaaaaaaaa 



taaggtagca 
tcccatggat 
caggtagaga 
aaagttcatg 
gggtagtgtc 
agcacctgaa 
gccaaaacaa 
taaagttaga 
ctttgggagg 
acatggtgca 
acctgtgatc 
aaggttgcag 
tcaggctcaa 
aagttgacct 
atgatgaatt 
aagtggccag 
taggagttca 
aaaattagct 
caagaatcac 
cttgtactcc 
taaagaaaac 
tttcttaaaa 
ccagcacttt 
ctgaccaaca 
cgcattgcct 
gaggcagagg 
tgaaactccg 
ttttctttac 
tcctcattag 
tcctcattcc 
tctttatcct 
tgtacttgga 
tgtcccatgc 
ctacaggagg 
tctagggtta 
taatatctat 
tgggtaagtt 
caatatttta 
taaacactaa 
tgcaccttac 
ggaaatctgc 
ccaatgaaaa 
acaatcctag 
tcacacagtg 
agttttccaa 
accacatcct 
taactcatcc 
aaagcaggtg 
cagcttcagc 
tggaaggaaa 
agtatatagt 
tctcccctgc 
ttggaattct 
agaaatagcg 
aattatactt 
tgactttgac 
ttcttctcct 
tctgcatcgg 
accacgcggc 



ataaccagga 
tagcagacaa 
gtgcagtggc 
agaccagctt 
gggcatggtg 
ttgaacccgg 
agcctggaca 
aagagtataa 
taaaaaatac 

gtgggggctg 

tggagaaacc 
gtaatcccag 
ttgcggtgag 
tctcaaaaaa 
tttaggcagt 
tagatcagag 
atgattttat 
gtgttgatac 
cattcagcat 
cacaataccg 
gagtcagtga 
attcctagaa 
tgccaacatg 
acgtaacctc 
cctcacaaaa 
gtgttggtgt 
cattttcttt 
ttgtgaaaaa 
cagcactata 
acttccacca 
agagtcttaa 
tggcaataac 
ctgggaagca 
ctaatcagca 
agtcagcccc 
ggaacagggt 
ctgacaagtc 
tttcacattt 
taacagaagg 
gcactccagt 
atgactccac 
tttttttttt 
tgcctccccc 
ttctttagga 
gatcacacag 
gccagcgctt 



tcaaagttca 

agctgcccag 

ggagtgatct 

aattagtagg 

tctggccaga 

ttatgcctta 

aagcaaattt 

ctctttacag 

tccaggctgg 

tagatgacct 

tattaaaaat 

gggaggctga 

tcgcaccact 

gaatgtctga 

gaaaactcaa 

agaccatcaa 

tcatgcctgt 

ggccaatatg 

gcaggtgcct 

gaggtggagg 

acagagtgag 

tgagaaaaat 

cagaaatggg 

aggcaggcag 

tcatctctac 

ctacttggga 

ctgagattgc 

aaaacaaaaa 

aagtgtgacc 

aaattcgaca 

gaatgcatga 

ctctctgctt 

aaactacctc 

gggaccttgt 

atgtctgcat 

gtagaattac 

ggaaagcaag 

tttaagcttc 

ttgtagtcag 

ccatccttaa 

tcttttcttt 

taagagaact 

tttctgatct 

taatgcagtt 

agatacaaaa 

ccatttatga 

gataagcata 

ttgtaaagca 

accgagagcc 

gagagttaat 

atgggtcaag 

taattcctcc 

gtgtgaatct 

tacttaactt 

ttttgcccct 

taaataaggt 

attctttgct 

tagtaagaga 

ccatcagcag 

cctcaatgca 
catgattgag aggaaagaaa 
aaaaaaaata tatatatata 



aagtcttgaa 
gtttcatcga 
ctgttattct 
ttgattttaa 
gggcatttat 
ctaaaaaccc 
aaacccaagg 
aatatttcca 
agaataaaat 
ctacaagact 
tttgtatata 
aagagaatag 
acagaagcgg 
ccttgtactc 
caggctctgt 
ataatcagtt 
aaagaaaaga 
gtaagatttg 
agggagtaac 
aatgtaagta 
gtcattctga 
taagcacctg 
ttcaggagtg 
tgcccacaca 
gttcttaatg 
tttgggaggc 
catggtgaaa 
ctgtaatccc 
gttgtgcaat 
tgtctccaaa 
taattacagc 
ccagcctgga 
tggtgatggg 
cccaggaagc 
agagcaagac 
aatcccagca 
cctggctaac 
tggtgggcac 
aggaggcgga 
gtgagactcc 
gtcccagcac 
gtccggccaa 
gtggtggcgg 
ggttgcagta 
aaaaaaaaag 
ggatgtggca 
ctgagatagc 
catccttgca 
ttttttgaga 
ctcaccacaa 
ctgggattac 
gtttctccat 
ggcctcccaa 
ttttgaaacc 
tgccatactg 
gatttctgga 
caaaatggta 
tactgaggcc 



tgaacagaga 
cagaagaagc 
tttaaagctg 
aatatatata 
ggctcttctg 
agaacatgag 
tttttattag 
tccacttcac 
gtggttttgg 
ctaggctgtt 
cattttgtgc 
cagcaaattt 
agtgtggccc 
attttaaagt 
gcattgtgcc 
gaacaccctt 
tattagagag 
atgaaagtaa 
tttttataac 
gtttacagta 
attgtaacaa 
atgaagtgac 
gggtttatgt 
ccagttgatt 
ttaagaattt 
cgagacaggc 
ccctgtcttt 
agctacgtgg 
gagccgagac 
aataaaaata 
attttggaag 
caacatggtg 
cacctataat 
agagattgca 
tctgtctcaa 
ctttgggagg 
atggtgaaac 
ctgtagtccc 
gcttgcagta 
gtctcaaaaa 
tttgggagac 
tatggcgaaa 
gcacctgggg 
agccaagatc 
aattttgcat 
cttacaaaat 
aggtaccttg 
ttggactaca 
tggagtttcg 
cctccacctc 
aggcatgcgc 
gttggtcagg 
agtgctggga 
agtctgaagt 
ccctaatgcc 
gaataatttt 
tcctaaccta 
caggaagggg 



gttctcttag 
tgtataaata 
atttattcca 
aaaaaggctt 
aagtattcat 
tatgaattct 
ttgaaatata 
ccactactgg 
tcatctatgc 
attgcttcaa 
gcaactcttg 
taaactagtg 
ttctgagcta 
gagactcggc 
gaaattatta 
tggaatttga 
cacaaaataa 
catctttatc 
aaagtggtac 
aaagcaaatg 
tttttctact 
ctggaggttt 
agtacaaact 
tgacctctct 
ttctacacag 
ggacctgggt 
tggggccggg 
ggatcacttg 
actaaaaata 
gtggctgaga 
cgtgtcactg 
agaaaaagaa 
gcccaagatg 
aaactccatc 
cctagctcct 
gtgagccaag 
aaaaaaaaga 
ccaaggcagg 
cctgtctcta 
agctactagg 
agccaagatc 
aaaaaagaat 
caaagtgggc 
ccctgtctct 
aggctgaggc 
gtgccactgc 
ggggaaggag 
caggagccag 
ataaccctga 
ttaatctgtc 
ctcttgttgc 
ccaggttcaa 
caccacacct 
ctggtctcga 
ttacaggcgt 
gagttttttt 
taatgattat 
tctttagtaa 
atggagctaa 
agaagtccct 



atgttactgc 
tataattatt 
ttgcaatatt 
tgtgtaagtt 
gtacttaaac 
atttaaaatt 
ttgatctttc 
actttgcctt 
tgtgattaat 
tctttaacag 
ctgcctctgc 
ctttcagtta 
gagatgccaa 
tacttttttc 
gccagattta 
ttcctccaac 
gattccctgg 
atgttgttga 
ctttgtaact 
tcagccaaat 
tggatttcaa 
gactagttca 
tctttgctgt 
ccagtgacag 
tgaccttttc 
tgaactcctg 
cacggtggct 
aggtcagggg 
caaaaattag 
caggggaatc 
cattccagcc 
ttttgggcta 
ggcagatcac 
tctactaaaa 
cgggaggctg 
atcacatctc 
atttggccag 
cagatcacga 
ctaaaaatac 
gaggctgagg 
gtgccactgc 
tttggccggg 
ggattacctg 
tactaaaaaa 
agggagaaat 
actccagagc 
agatactgtt 
cactgcatgg 
agacatcctt 
agttatcctt 
ccaggctgga 
gtgattctgc 
ggctaatttt 
actcccaacc 
aagccatggt 
aattacgtga 
gtattctcag 
acttcactta 
aagacacccc 
ggcttgtgag 



ttttgctcag 
aatcactttt 
tgattgtata 
tttggtacta 
catattatat 
gtgtcaactt 
caaatatttt 
gtgtttgaag 
tcattttgtt 
aaaagcaata 
atgttttgga 
agataaattc 
gtagttgtaa 
tgccccacct 
atatttgatc 
attgagcacc 
tggagttttt 
cattgacaca 
tgatgtgtct 
ccagtgaaca 
cattcagtag 
gtaggaattt 
tttatttaag 
tgtttgggta 
tctcgccctc 
atccagacag 
catgcctgta 
ttcgaggcca 
ctgggcatgg 
gcttgaacct 
tgggtgacag 
ggtgcagtgg 
ttgaggacag 
agacaaaagt 
gggcaggaga 
tgcactccag 
gcgcagtggt 
ggtcaggaga 
aaaacattag 
cagaggaagg 
actacagtct 
tgcggtggca 
aggtcaggag 
aatacaaaaa 
gcttgaaccg 
aagactcttt 
caccatctgg 
acaaacagaa 
ggtttctgca 
ataatgattt 
gtgcaatggc 
tgcctcagcc 
gtatttttag 
tcaggtgatc 
acccggtctg 
aaggagtttg 
catgtctgca 
agtcgtcatg 
ttgtttttat 
atgatcacca 



actttgcaaa 

gtccttgaga 

gaggcacact 

tgtaccacct 

ttaattgtgt 

tctgctttca 

catttgcttt 

tgtatggcat 

cttttaacaa 

taaaggttat 

ataacaattt 

taatcatttc 

actgcttata 

gctttgagac 

taaagtaggt 

caccatgttc 

atgggttcaa 

aattgtttaa 

tcatcattcg 

gcaataaaac 

agcttttcga 

ggaggggaag 

tactgagagc 

cctgcctgac 

tcctccctct 

gcccaagaca 

attgcaacac 

gcctggccaa 

tggcgcacgc 

ggaggcggag 

agggagactc 

ctcacgcctg 

gagttcgaga 

tagccagatg 

atcacttgaa 

cctgggcaac 

tcacgcctgt 

tcgagattgt 

ccgggtgtgg 

atgtgaaccc 

gggcgacaga 

catgcctgta 

ttcaagacca 

ttagccaggt 

gggaggcaga 

ctcaaaaaaa 

aatggtgctt 

gcatgtgggc 

tctattcctg 

ttgatttttt 

acgatctcgg 

tcctgagtaa 

tagagacggg 

accctgtctc 

ttttttgatt 

gctaaaatac 

aagtactgct 

tgtattctct 

aacaagcagt 

ttagaactca 
ggcctgggcc 
accgctttta 
agaagtttgg 
ctaggcaagg 
aatagttacc 
cttgggcagg 
tggagatggc 
tgagggcatt 
tgggtattcc 
aactgcctgc 
tgcctaaagg 
gtcagactgg 
gagctgggat 
ccagcttcag 
aggctggcct 
tccagctgaa 
agagtacctt 
agtcaccctg 
ggcaccttgg 
atgctggtat 
ggtgggcaga 
tctctactaa 
gctacttggg 
gccgagatgg 
aaaaaatatg 
tacccaccaa 
tgcagccgag 
tccgcaagga 
tcaggtgttg 
caagtctctg 
cagctccgtg 
cagttactat 
gctaattcca 
gagtaaacac 
actgattttc 
tgttgcccag 



agtgcctttt 
gcaattgtaa 
cgggagagat 
tctctggcct 
cactgaggcc 
actgggcgtg 
cagtgatgtc 
catgttcagg 
tgccttagta 
tctggcacat 
tgacagtgca 
cccagtctgt 
ttaccaagaa 
ctgactcctg 
ctcccacatc 
ggaagcccca 
ttgtttcttc 
taagtggcag 
agcttgcatt 
tggccaggcg 
tcatgaagtc 
aagtaaaaaa 
aggctgaggc 
caccactgca 
ctggtagttt 
ggtctactgg 
agggggagct 

ccagttgctg 
agcacctggt 
aacagggctt 
gctctgcaga 
atgcatgcag 
taaagtaatc 
tgtgtgtgcc 
tcccttcttt 
gctagtcttg 



catgcttctc 
taaacccaga 
aatttttaca 
tgtaaaaccc 
ctctccgggt 
gtgcagagta 
caataaagga 
gagggttgct 
actttatgta 
tcagaatgtc 
tctccttccc 
gggcaaggag 
gcaaatgaga 
tatattgact 
acagtaagaa 
cacttctttc 
taattatgta 
gcactgttta 
ctattgaaga 
cagtggctca 
aggagatcga 
aaaaattagc 
aggagaatgg 
ctccagcctg 
tgattcaaga 
aaaacatcag 
gaagagaagt 
tgccactcca 
ttacaagatg 
accttagagt 
gctttgggac 
tataaaatta 
agctcctgag 
aggcagcgtc 
aaacaaagtt 
aattc 



agatccttcc 
aatagaaagc 
aaatttgtaa 
ctcaaggtta 
gaacattgag 
ggagcggtga 
cactggaggg 
gcccactggc 
aacaagtatt 
acagaactca 
caccccaccc 
cctagagagg 
gacgaggatt 
gtgccttcag 
ttccacacac 
aagtttttct 
actattggtt 
cagggacaca 
ggtaatggaa 
cacctgtaat 
gaccatcctg 
caggtgtggt 
tgtgaaccca 
ggtgacagag 
tggcctttgg 
gctctcctgc 
gccccttctg 
ttcacttgct 
tcagcatctt 
aaggcttaga 
atgtgaattc 
taaccttgga 
ttctgcagtg 
tcatttgatc 
tttttttttt 



aaagaataat 
tttttggtta 
atacctgcca 
caactttggt 
cactagagga 
tactgtggat 
agcagtgtga 
ttgcttggca 
tcctcagtct 
cctggatgca 
ctcataccac 
gcttagtttc 
gcaacaactg 
actcatccgt 
catacaactt 
tagtcttctc 
tagtaaatat 
ggaaggaata 
gttgggatag 
cccagcactt 
gctaacatgg 
ggcgggcgcc 

ggaggcgaag 
cgagactctg 
agcccatgat 
tatagaccca 
tgtcctgtca 
gcaagactgg 
gatgcctgag 
agaggccgta 
ttaaaaacaa 
aaatcctagc 
gtaataataa 
cttgtgataa 
ttttagagag 



gaagattata 
gagtactggt 
attctatata 
ggcccacact 
agcccctctg 
tctgggcagg 
gtaaaggccc 
cacaggagag 
gttcctctca 
ttcagcccct 
tgaagcacct 
agcttgaaag 
tgccatttcc 
aagtgacccc 
ggaaagaggc 
ttcttggcaa 
tcacccattc 
aaaacttgca 
cagctaaact 
tggaggccaa 
tgaaaccccg 
tgtagtccca 
attgcagtga 
tctcagaaaa 
ttaggtctcg 
tagggagagc 
gcctcatcct 
aggtttttcc 
accatcaagg 
aagtcagtct 
gactattgta 
tagctgttga 
tcagcataat 
tcttgtaagt 
ggtctcacta 



159960 
160020 
160080 
160140 
160200 
160260 
160320 
160380 
160440 
160500 
160560 
160620 
160680 
160740 
160800 
160860 
160920 
160980 
161040 
161100 
161160 
161220 
161280 
161340 
161400 
161460 
161520 
161580 
161640 
161700 
161760 
161820 
161880 
161940 
162000 
162025 



<210> 37 

<211> 1350 

<212> DNA 

<213> Homo Sapien 



<220> 
<221> CDS 
<222> (213)...(920) 



<300> 

<308> GenBank AJ242973 
<309> 1999-10-26 



<400> 37 

gcggccgcgt cgacgtgaca gccggtacgc ccgggtttgg gcaacctcga ttacgggcgg 60 
cctccaggcc cgccagcagc gccccgcgcc gcccgcccgc gcccctgccg ccccccggtt 120 
ccggccgcgg accccactct ctgccgttcc ggctgcggct ccgctgccgg tagcgccgtc 180 
ccccgggacc acccttcggc tggcgccctc cc atg ctc tcg gcc acc cgg agg 233 
Met Leu Ser Ala Thr Arg Arg 
1 5 

gct tgc cag ctc ctc ctc ctc cac agc ctc ttt ccc gtc ccg agg atg 281 
Ala Cys Gln Leu Leu Leu Leu His Ser Leu Phe Pro Val Pro Arg Met 
10 15 20 

ggc aac tcg gcc tcg aac atc gtc agc ccc cag gag gcc ttg ccg ggc 329 
Gly Asn Ser Ala Ser Asn Ile Val Ser Pro Gln Glu Ala Leu Pro Gly 
25 30 35 

cgg aag gaa cag acc cct gta gcg gcc aaa cat cat gtc aat ggc aac 377 
Arg Lys Glu Gln Thr Pro Val Ala Ala Lys His His Val Asn Gly Asn 
40 45 50 55 

aga aca gtc gaa cct ttc cca gag gga aca cag atg gct gta ttt gga 425 
Arg Thr Val Glu Pro Phe Pro Glu Gly Thr Gln Met Ala Val Phe Gly 
60 65 70 

atg gga tgt ttc tgg gga gct gaa agg aaa ttc tgg gtc ttg aaa gga 473 
Met Gly Cys Phe Trp Gly Ala Glu Arg Lys Phe Trp Val Leu Lys Gly 
75 80 85 

gtg tat tca act caa gtt ggt ttt gca gga ggc tat act tca aat cct 521 
Val Tyr Ser Thr Gln Val Gly Phe Ala Gly Gly Tyr Thr Ser Asn Pro 
90 95 100 

act tat aaa gaa gtc tgc tca gaa aaa act ggc cat gca gaa gtc gtc 569 
Thr Tyr Lys Glu Val Cys Ser Glu Lys Thr Gly His Ala Glu Val Val 
105 110 115 

cga gtg gtg tac cag cca gaa cac atg agt ttt gag gaa ctg ctc aag 617 
Arg Val Val Tyr Gln Pro Glu His Met Ser Phe Glu Glu Leu Leu Lys 
120 125 130 135 

gtc ttc tgg gag aat cac gac ccg acc caa ggt atg cgc cag ggg aac 665 
Val Phe Trp Glu Asn His Asp Pro Thr Gln Gly Met Arg Gln Gly Asn 
140 145 150 

gac cat ggc act cag tac cgc tcg gcc atc tac ccg acc tct gcc aag 713 
Asp His Gly Thr Gln Tyr Arg Ser Ala Ile Tyr Pro Thr Ser Ala Lys 
155 160 165 

caa atg gag gca gcc ctg agc tcc aaa gag aac tac caa aag gtt ctt 761 
Gln Met Glu Ala Ala Leu Ser Ser Lys Glu Asn Tyr Gln Lys Val Leu 
170 175 180 

tca gag cac ggc ttc ggc ccc atc act acc gac atc cgg gag gga cag 809 
Ser Glu His Gly Phe Gly Pro Ile Thr Thr Asp Ile Arg Glu Gly Gln 
185 190 195 

act ttc tac tat gcg gaa gac tac cac cag cag tac ctg agc aag aac 857 
Thr Phe Tyr Tyr Ala Glu Asp Tyr His Gln Gln Tyr Leu Ser Lys Asn 
200 205 210 215 

ccc aat ggc tac tgc ggc ctt ggg ggc acc ggc gtg tcc tgc cca gtg 905 
Pro Asn Gly Tyr Cys Gly Leu Gly Gly Thr Gly Val Ser Cys Pro Val 
220 225 230 

ggt att aaa aaa taa ttgctcccca catggtgggc ctttgaggtt ccagtaaaaa 960 
Gly Ile Lys Lys * 
235 

tgctttcaac aaattgggca atgcttgtgt gattcacaat cgtggcattt aaagtgcaca 1020 
aagtacaaag gaatttatac agattgggtt taccgaagta taatctatag gaggcgcgat 1080 
ggcaagttga taaaatgtga cttatctcct aataagttat ggtgggagtg gagctgtgcg 1140 

gtttcctgtg tcttctgggg tctgagtgaa gatagcaggg atgctgtgtt cacccttctt 1200 

ggtagaagct aaggtgtgag ctgggaggtt gctggacagg atgggggacc ccagaagtcc 1260 

tttatctgtg ctctctgccc gccagtgcct tacaatttgc aaacgtgtat agcctcagtg 1320 

actcattcgc tgaaatcctt cgctttacca 1350 
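
The CDS feature above, (213)...(920), and the 235-residue length declared for SEQ ID NO: 38 can be cross-checked with a few lines of code. The following is a minimal sketch, not part of the listing itself: the function names `translate` and `codon_count` are hypothetical, the codon table is the standard genetic code, and the two short codon strings are the opening and closing codons of the coding region as implied by the translation lines.

```python
from itertools import product

# Standard genetic code, with codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon."""
    cds = cds.replace(" ", "").lower()
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def codon_count(start: int, end: int) -> int:
    """Number of codons in a CDS given 1-based inclusive coordinates."""
    length = end - start + 1
    if length % 3:
        raise ValueError("CDS length is not a multiple of three")
    return length // 3


# Opening and closing codons of the coding region (illustrative excerpts).
FIRST_CODONS = "atg ctc tcg gcc acc cgg agg"
LAST_CODONS = "ggt att aaa aaa taa"

print(translate(FIRST_CODONS))      # one-letter codes for Met Leu Ser Ala Thr Arg Arg
print(translate(LAST_CODONS))       # one-letter codes for Gly Ile Lys Lys
print(codon_count(213, 920) - 1)    # residues, excluding the stop codon
```

The residue count obtained from the coordinates should agree with the `<211>` value of the corresponding protein record; a mismatch would point to a transcription error in either the coordinates or the sequence.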

<210> 38 

<211> 235 

<212> PRT 

<213> Homo Sapien 

<400> 38 

Met Leu Ser Ala Thr Arg Arg Ala Cys Gln Leu Leu Leu Leu His Ser 
1 5 10 15 

Leu Phe Pro Val Pro Arg Met Gly Asn Ser Ala Ser Asn Ile Val Ser 
20 25 30 

Pro Gln Glu Ala Leu Pro Gly Arg Lys Glu Gln Thr Pro Val Ala Ala 
35 40 45 

Lys His His Val Asn Gly Asn Arg Thr Val Glu Pro Phe Pro Glu Gly 
50 55 60 

Thr Gln Met Ala Val Phe Gly Met Gly Cys Phe Trp Gly Ala Glu Arg 
65 70 75 80 

Lys Phe Trp Val Leu Lys Gly Val Tyr Ser Thr Gln Val Gly Phe Ala 
85 90 95 

Gly Gly Tyr Thr Ser Asn Pro Thr Tyr Lys Glu Val Cys Ser Glu Lys 
100 105 110 

Thr Gly His Ala Glu Val Val Arg Val Val Tyr Gln Pro Glu His Met 
115 120 125 

Ser Phe Glu Glu Leu Leu Lys Val Phe Trp Glu Asn His Asp Pro Thr 
130 135 140 

Gln Gly Met Arg Gln Gly Asn Asp His Gly Thr Gln Tyr Arg Ser Ala 
145 150 155 160 

Ile Tyr Pro Thr Ser Ala Lys Gln Met Glu Ala Ala Leu Ser Ser Lys 
165 170 175 

Glu Asn Tyr Gln Lys Val Leu Ser Glu His Gly Phe Gly Pro Ile Thr 
180 185 190 

Thr Asp Ile Arg Glu Gly Gln Thr Phe Tyr Tyr Ala Glu Asp Tyr His 
195 200 205 

Gln Gln Tyr Leu Ser Lys Asn Pro Asn Gly Tyr Cys Gly Leu Gly Gly 
210 215 220 

Thr Gly Val Ser Cys Pro Val Gly Ile Lys Lys 
225 230 235 



<210> 39 

<211> 481 

<212> DNA 

<213> Homo Sapien 

<300> 

<308> GenBank AW195104 
<309> 1999-11-29 



<400> 39 

ggcattattg gactgtaggt ttttattaaa acaaacattt ctcatagctc taagcaaagc 60 

attagaattc atcaagcgga ctcacatctt ttctctgcac agagaggggc tgaaaaggga 120 

gagaaagtcc cttatgtatg tctagatttg gtaaagcgaa ggatttcagc gaatgagtca 180 

ctgaggctat acacgtttgc aaattgtaag gcactggcgg gcagagagca cagataaagg 240 

acttctgggg tcccccatcc tgtccagcaa cctcccagct cacaccttag cttctaccaa 300 

gaagggtgaa cacagcatcc ctgctatctt cactcagacc ccagaaaacc cagggaaacc 360 

cgacagctcc actcccacca taacttatta ggagataagt cacattttat caacttgcca 420 
tcgcgcctcc tatagattat acttcggtaa acccaatctg tataaattcc tttgtacttt 480 
g 481 

<210> 40 

<211> 390 

<212> DNA 

<213> Homo Sapien 

<300> 

<308> GenBank AW874187 
<309> 2000-05-22 

<400> 40 

ttttttttat tggactgtag gtttttatta aaacaaacat ttctcatagc tctaagcaaa 60 

gcattagaat tcatcaagcg gactcacatc ttttctctgc acagagaggg ctgaaaaggg 120 

agagaaagcc ccttatgtat gtctagattt ggtaaagcga aggatttcag cgaatgagtc 180 

actgaggcta tacacgtttg caaattgtaa ggcactggcg ggcagagagc acagataaag 240 

gacttttggg ggtcccccat tcctgtccag caacctccca gctcacacct tagcttctac 300 

caagaagggg tgaacacagc atccctgcta tcttcactca gacccccaga agacacagga 360 

aaccgcacag ctccactccc accataactt 390 

<210> 41 
<211> 43 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide Primer 
<400> 41 

agcggataac aatttcacac agggagctag cttggaagat tgc 43 
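
Each record in this listing declares its length in the `<211>` field, so a declared length can be compared against the number of residues actually present after `<400>`. The sketch below illustrates one way to do that for a nucleotide record; the function name `declared_and_actual_length` and the parsing rules (skip the remainder of the `<400>` line, then count letters) are assumptions made for illustration, not a parser for the full sequence listing standard.

```python
import re

# SEQ ID NO: 41 as it appears in the listing above.
RECORD_41 = """<210> 41
<211> 43
<212> DNA
<213> Artificial Sequence
<220>
<223> Oligonucleotide Primer
<400> 41

agcggataac aatttcacac agggagctag cttggaagat tgc 43
"""


def declared_and_actual_length(record: str) -> tuple[int, int]:
    """Return (length declared in <211>, number of bases found after <400>)."""
    declared = int(re.search(r"<211>\s*(\d+)", record).group(1))
    # Everything after the <400> tag; drop the rest of that line (the SEQ ID).
    after_tag = record.split("<400>", 1)[1]
    sequence_lines = after_tag.splitlines()[1:]
    # Count letters only, so spacing and running position numbers are ignored.
    actual = len(re.findall(r"[a-z]", "".join(sequence_lines).lower()))
    return declared, actual


print(declared_and_actual_length(RECORD_41))
```

A record whose two numbers disagree is a candidate for a scanning or transcription error in either the header or the sequence line.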

<210> 42 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide Primer 
<400> 42 

gtccaatata tgcaaacagt tg 22 

<210> 43 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide Primer 
<400> 43 

agcggataac aatttcacac agg 23 

<210> 44 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 
<223> Oligonucleotide Primer 
<400> 44 

actgagcctg ctgcataa 18 

<210> 45 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide Primer 
<400> 45 

tctcaatcat gtgcattgag g 21 

<210> 46 
<211> 43 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide Primer 
<400> 46 

agcggataac aatttcacac agggatcaca cagccatcag cag 43 

<210> 47 
<211> 23 
<212> DNA 

<213> oligonucleotide primer 
<400> 47 

agcggataac aatttcacac agg 23 

<210> 48 
<211> 18 
<212> DNA 

<213> Oligonucleotide primer 
<400> 48 

ctggcgccac gtggtcaa 18 

<210> 49 

<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide Primer 

<400> 49 

tttctctgca cagagagggc 20 

<210> 50 

<211> 44 

<212> DNA 

<213> Artificial Sequence 
<220> 
<223> Oligonucleotide Primer 
<400> 50 

agcggataac aatttcacac agggctgaaa tccttcgctt tacc 44 

<210> 51 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide Primer 
<400> 51 

agcggataac aatttcacac agg 23 

<210> 52 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide Primer 
<400> 52 

ctgaaaaggg agagaaag 18 

<210> 53 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide Primer 
<400> 53 

tcccaaagtg ctggaattac 20 

<210> 54 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide Primer 
<400> 54 

gtccaatata tgcaaacagt tg 22 

<210> 55 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide Primer 
<400> 55 

cccacagcag ttaatccttc 20 



WO 01/27857 PCT/US00/28413 

<210> 56 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 56 

gcgctcctgt cggtgcca 18 

<210> 57 

<211> 18 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 

<400> 57 

gcctgactgg tggggccc 18 

<210> 58 
<211> 15 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 58 

catgcatgca cggtc 15 

<210> 59 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 59 

cagagagtac ccctcgaccg tgcatgcatg 30 

<210> 60 
<211> 15 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 60 

catgcatgca cggtt 15 

<210> 61 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 61 

gtacgtacgt gccaactccc catgagagac 30 

<210> 62 
<211> 14 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 62 

catgcatgca cggt 14 

<210> 63 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 63 

gcctgactgg tggggccc 18 

<210> 64 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 64 

gtgctgcagg tgtaaacttg taccag 26 

<210> 65 
<211> 28 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 65 

cacggatccg gtagcagcgg tagagttg 28 

<210> 66 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 66 

actgggcatg tggagacag 19 

<210> 67 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 67 

gcactttctt gccatgag 18 

<210> 68 
<211> 14 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 68 

tcagtcacga cgtt 14 

<210> 69 
<211> 14 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 69 

cggataacaa tttc 14 

<210> 70 
<211> 37 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 70 

caatttcatc gctggatgca atctgggcta tgagatc 37 

<210> 71 
<211> 37 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 71 

caatttcaca cagcggatgc ttcttttggc tctgact 37 

<210> 72 
<211> 40 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 72 

tcagtcacga cgttggatgc caataaaagt gactctcagc 40 

<210> 73 
<211> 37 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 73 

cggataacaa tttcggatgc actgggagca ttgaggc 37 

<210> 74 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 74 

tcagtcacga cgttggatga gcagatccct ggacaggc 38 

<210> 75 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 75 

cggataacaa tttcggatgg acaaaatacc tgtattcc 38 

<210> 76 
<211> 36 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 76 

tcagtcacga cgttggatgc agagcagctc cgagtc 36 



<210> 77 
<211> 32 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 77 

cagcggtgat cattggatgc aggaagctct gg 32 

<210> 78 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 78 

tcagtcacga cgttggatgc ccacatgcca cccactac 38 

<210> 79 
<211> 35 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 79 

cggataacaa tttcggatgc ccgtcaggta ccacg 35 

<210> 80 
<211> 37 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 80 

tcagtcacga cgttggatgc ccacagtgga gcttcag 37 

<210> 81 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 81 

gctcatacct tgcaggatga cg 22 

<210> 82 
<211> 36 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 82 

tcagtcacga cgttggatga ccagctgttc gtgttc 36 

<210> 83 
<211> 34 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 83 

tacatggagt tcggggatgc acacggcgac tctc 34 

<210> 84 
<211> 40 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 84 

tcagtcacga cgttggatgg ggaagagcag agatatacgt 40 

<210> 85 
<211> 29 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 85 

gaggggctga tccaggatgg gtgctccac 29 

<210> 86 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 86 

tgaagcactt gaaggatgag ggtgtctgcg 30 

<210> 87 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 87 

cggataacaa tttcggatgc tgcgtgatga tgaaatcg 38 

<210> 88 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 88 

gatgaagctc ccaggatgcc agaggc 26 

<210> 89 
<211> 27 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 89 

gccgccggtg taggatgctg ctggtgc 27 

<210> 90 
<211> 31 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide Template 
<400> 90 

cgcagggttt cctcgtcgca ctgggcatgt g 31 

<210> 91 
<211> 43 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Biotinylated primer 
<400> 91 

tgcttatccc tgtagctacc ctgtcttggc cttgcagatc caa 43 

<210> 92 
<211> 42 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 92 

agcggataac aatttcacac aggccatcac accgcggtac tg 42 

<210> 93 
<211> 44 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 93 

cccagtcacg acgttgtaaa acgtcttggc cttgcagatc caag 44 

<210> 94 
<211> 42 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 94 

agcggataac aatttcacac aggccatcac accgcggtac tg 42 

<210> 95 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 95 

ctccagctgg gcaggagtgc 20 

<210> 96 

<211> 17 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 

<400> 96 

cacttcagtc gctccct 17 

<210> 97 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Biotinylated primer 
<400> 97 

cccagtcacg acgttgtaaa acg 23 

<210> 98 

<211> 100 

<212> DNA 

<213> Homo sapien 

<400> 98 

cctttgagaa agggctctgc ttgagttgta gaaagaaccg ctgcaacaat ctgggctatg 60 
agatcaataa agtcagagcc aaaagaagca gcaaaatgta 100 

<210> 99 

<211> 100 

<212> DNA 

<213> Homo sapien 

<400> 99 

cctttgagaa agggctctgc ttgagttgta gaaagaaccg ctgcaacaat ctgggctatg 60 
agatcagtaa agtcagagcc aaaagaagca gcaaaatgta 100 
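
SEQ ID NOS: 98 and 99 are the same 100-base stretch apart from what appears to be a single substitution. The following sketch locates that position by direct comparison. The helper name `snp_positions` is hypothetical, the two strings are copied from the records above, and the approach assumes the two sequences are already aligned and of equal length, which holds for this pair but would not hold for insertions or deletions.

```python
SEQ_98 = (
    "cctttgagaa agggctctgc ttgagttgta gaaagaaccg ctgcaacaat ctgggctatg "
    "agatcaataa agtcagagcc aaaagaagca gcaaaatgta"
)
SEQ_99 = (
    "cctttgagaa agggctctgc ttgagttgta gaaagaaccg ctgcaacaat ctgggctatg "
    "agatcagtaa agtcagagcc aaaagaagca gcaaaatgta"
)


def snp_positions(seq_a: str, seq_b: str) -> list[tuple[int, str, str]]:
    """List (1-based position, base in seq_a, base in seq_b) for each mismatch."""
    a = seq_a.replace(" ", "").lower()
    b = seq_b.replace(" ", "").lower()
    if len(a) != len(b):
        raise ValueError("sequences must have equal length for direct comparison")
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


print(snp_positions(SEQ_98, SEQ_99))
```

The same comparison can be applied to the other consecutive pairs of genomic fragments in this listing to find where each pair differs.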

<210> 100 
<211> 100 
<212> DNA 
<213> Homo sapien 
<400> 100 

gaattatttt tgtgtttcta aaactatggt tcccaataaa agtgactctc agcgagcctc 60 
aatgctccca gtgctattca tgggcagctc tctgggctca 100 

<210> 101 
<211> 100 
<212> DNA 
<213> Homo sapien 

<400> 101 

gaattatttt tgtgtttcta aaactatggt tcccaataaa agtgactctc agcaagcctc 60 
aatgctccca gtgctattca tgggcagctc tctgggctca 100 

<210> 102 

<211> 84 

<212> DNA 

<213> Homo sapien 

<400> 102 

taataggact acttctaatc tgtaagagca gatccctgga caggcgagga atacaggtat 60 
tttgtccttg aagtaacctt tcag 84 

<210> 103 

<211> 84 

<212> DNA 

<213> Homo sapien 

<400> 103 

taataggact acttctaatc tgtaagagca gatccctgga caggcaagga atacaggtat 60 
tttgtccttg aagtaacctt tcag 84 

<210> 104 

<211> 100 

<212> DNA 

<213> Homo sapien 

<400> 104 

ctcaccatgg gcatttgatt gcagagcagc tccgagtccg tccagagctt cctgcagtca 60 
atgatcaccg ctgtgggcat ccctgaggtc atgtctcgta 100 

<210> 105 
<211> 100 
<212> DNA 
<213> Homo sapien 

<400> 105 

ctcaccatgg gcatttgatt gcagagcagc tccgagtcca tccagagctt cctgcagtca 60 
atgatcaccg ctgtgggcat ccctgaggtc atgtctcgta 100 

<210> 106 
<211> 100 
<212> DNA 
<213> Homo sapien 

<400> 106 

agcaaggact cctgcaaggg ggacagtgga ggcccacatg ccacccacta ccagggcacg 60 
tggtacctga cgggcatcgt cagctggggc cagggctgcg 100 

<210> 107 
<211> 100 
<212> DNA 
<213> Homo sapien 

<400> 107 

agcaaggact cctgcaaggg ggacagtgga ggcccacatg ccacccacta ccggggcacg 60 
tggtacctga cgggcatcgt cagctggggc cagggctgcg 100 

<210> 108 

<211> 100 

<212> DNA 

<213> Homo sapien 

<400> 108 

caataactct aatgcagcgg aagatgacct gcccacagtg gagcttcagg gcgtggtgcc 60 
ccggggcgtc aacctgcaag gtatgagcat accccccttc 100 

<210> 109 
<211> 100 
<212> DNA 
<213> Homo sapien 

<400> 109 

caataactct aatgcagcgg aagatgacct gcccacagtg gagcttcagg gcttggtgcc 60 
ccggggcgtc aacctgcaag gtatgagcat accccccttc 100 

<210> 110 
<211> 100 
<212> DNA 
<213> Homo sapien 

<400> 110 

ttgaagcttt gggctacgtg gatgaccagc tgttcgtgtt ctatgatcat gagagtcgcc 60 
gtgtggagcc ccgaactcca tgggtttcca gtagaatttc 100 

<210> 111 
<211> 100 
<212> DNA 
<213> Homo sapien 

<400> 111 

ttgaagcttt gggctacgtg gatgaccagc tgttcgtgtt ctatgatgat gagagtcgcc 60 
gtgtggagcc ccgaactcca tgggtttcca gtagaatttc 100 

<210> 112 
<211> 100 
<212> DNA 
<213> Homo sapien 

<400> 112 

ggataacctt ggctgtaccc cctggggaag agcagagata tacgtgccag gtggagcacc 60 
caggcctgga tcagcccctc attgtgatct gggagccctc 100 

<210> 113 
<211> 100 
<212> DNA 
<213> Homo sapien 



<400> 113 
ggataacctt ggctgtaccc cctggggaag agcagagata tacgtaccag gtggagcacc 60 
caggcctgga tcagcccctc attgtgatct gggagccctc 100 



<210> 114 

<211> 80 

<212> DNA 

<213> Homo sapien 

<400> 114 

tgaagcactt gaaggagaag gtgtctgcgg gagccgattt catcatcacg cagcttttct 60 
ttgaggctga cacattcttc 80 

<210> 115 

<211> 80 

<212> DNA 

<213> Homo sapien 

<400> 115 

tgaagcactt gaaggagaag gtgtctgcgg gagtcgattt catcatcacg cagcttttct 60 
ttgaggctga cacattcttc 80 

<210> 116 

<211> 80 

<212> DNA 

<213> Homo sapien 

<400> 116 

tccagatgaa gctcccagaa tgccagaggc tgctccccgc gtggcccctg caccagcagc 60 
tcctacaccg gcggcccctg 80 

<210> 117 

<211> 80 

<212> DNA 

<213> Homo sapien 

<400> 117 

tccagatgaa gctcccagaa tgccagaggc tgctcccccc gtggcccctg caccagcagc 60 
tcctacaccg gcggcccctg 80 

<210> 118 
<211> 48 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Hair pin structure 
<400> 118 

cagagagtac ccctcaaccg tgcatgcatg aaacatgcat gcacggtt 48 
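
SEQ ID NO: 118 is annotated only as a hairpin structure. A simple self-complementarity check shows which part of the oligonucleotide can fold back on itself. The sketch below is illustrative: the function names are hypothetical, and it only searches for the longest 3'-terminal segment whose reverse complement occurs upstream, ignoring thermodynamics, mismatches and bulges.

```python
SEQ_118 = "cagagagtac ccctcaaccg tgcatgcatg aaacatgcat gcacggtt".replace(" ", "")

COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a lowercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def three_prime_hairpin(seq: str):
    """Find the longest 3' arm that pairs with an upstream segment.

    Returns (start of upstream arm, stem length, loop sequence) using
    0-based indexing, or None if no pairing arm is found.
    """
    n = len(seq)
    for k in range(n // 2, 0, -1):
        arm = seq[n - k:]
        # Search only the part of the sequence upstream of the 3' arm.
        start = seq.find(reverse_complement(arm), 0, n - k)
        if start != -1:
            return start, k, seq[start + k:n - k]
    return None


start, stem_length, loop = three_prime_hairpin(SEQ_118)
print(start, stem_length, loop)
```

Under these assumptions the 3' end pairs with an internal segment and leaves a short loop between the two arms, with the 5' portion of the oligonucleotide left single-stranded.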
METHODS FOR GENERATING DATABASES AND DATABASES FOR 
IDENTIFYING POLYMORPHIC GENETIC MARKERS 

RELATED APPLICATIONS 

Benefit of priority to the following applications is claimed herein: 
U.S. provisional application Serial No. 60/217,658 to Andreas Braun, Hubert 
Koster, Dirk Van den Boom, filed July 10, entitled "METHODS FOR 
GENERATING DATABASES AND DATABASES FOR IDENTIFYING 
POLYMORPHIC GENETIC MARKERS"; U.S. provisional application Serial No. 
60/159,176 to Andreas Braun, Hubert Koster, Dirk Van den Boom, filed October 
13, 1999, entitled "METHODS FOR GENERATING DATABASES AND 
DATABASES FOR IDENTIFYING POLYMORPHIC GENETIC MARKERS"; U.S. 
provisional application Serial No. 60/217,251, filed July 10, 2000, to Andreas 
Braun, entitled "POLYMORPHIC KINASE ANCHOR PROTEIN GENE SEQUENCES, 
POLYMORPHIC KINASE ANCHOR PROTEINS AND METHODS OF DETECTING 
POLYMORPHIC KINASE ANCHOR PROTEINS AND NUCLEIC ACIDS ENCODING 
THE SAME"; and U.S. application Serial No. 09/663,968, to Ping Yip, filed 
September 19, 2000, entitled "METHOD AND DEVICE FOR IDENTIFYING A 
BIOLOGICAL SAMPLE." 

Where permitted, the above-noted applications and provisional 
applications are incorporated by reference in their entirety. 
FIELD OF THE INVENTION 

Processes and methods for creating a database of genomic samples from 
healthy human donors are provided. Methods that use the database to identify 
polymorphic genetic markers and other markers, and to correlate them with 
diseases and conditions, are also provided. 
BACKGROUND 

Diseases in all organisms have a genetic component, whether inherited or 
resulting from the body's response to environmental stresses, such as viruses 
and toxins. The ultimate goal of ongoing genomic research is to use this 
information to develop new ways to identify, treat and potentially cure these 
diseases. The first step has been to screen disease tissue and identify genomic 
changes at the level of individual samples. The identification of these "disease" 
markers has then fueled the development and commercialization of diagnostic 
tests that detect these errant genes or polymorphisms. With the increasing 
numbers of genetic markers, including single nucleotide polymorphisms (SNPs), 
microsatellites, tandem repeats, newly mapped introns and exons, the challenge 
to the medical and pharmaceutical communities is to identify genotypes which 
not only identify the disease but also follow the progression of the disease and 
are predictive of an organism's response to treatment. 

Currently the pharmaceutical and biotechnology industries find a disease 
and then attempt to determine the genomic basis for the disease. This approach 
is time consuming and expensive and in many cases involves the investigator 
guessing as to what pathways might be involved in the disease. 
Genomics 

Presently the two main strategies employed in analyzing the available 
genomic information are the technology-driven, reverse-genetics brute force 
strategy and the knowledge-based, pathway-oriented forward genetics strategy. 
The brute force approach yields large databases of sequence information but 
little information about the medical or other uses of the sequence information. 
Hence this strategy yields intangible products of questionable value. The 
knowledge-based strategy yields small databases that contain a lot of 
information about medical uses of particular DNA sequences and other products 
in the pathway and yields tangible products with a high value. 

Polymorphisms 

Polymorphisms have been known since 1901 with the identification of 
blood types. In the 1950's they were identified on the level of proteins using 
large population genetic studies. In the 1980's and 1990's many of the known 
protein polymorphisms were correlated with genetic loci on genomic DNA. For 
example, the gene dose of the apolipoprotein E type 4 allele was correlated with 
the risk of Alzheimer's disease in late onset families (see, e.g., Corder et al. 
(1993) Science 261:921-923); a mutation in blood coagulation factor V was 
associated with resistance to activated protein C (see, e.g., Bertina et al. (1994) 
Nature 369:64-67); resistance to HIV-1 infection has been shown in Caucasian 
individuals bearing mutant alleles of the CCR-5 chemokine receptor gene (see, 
e.g., Samson et al. (1996) Nature 382:722-725); and a hypermutable tract in 
antigen presenting cells (APC, such as macrophages), has been identified in 
familial colorectal cancer in individuals of Ashkenazi Jewish background (see, e.g., 
Laken et al. (1997) Nature Genet. 17:79-83). There may be more than three 
million polymorphic sites in the human genome. Many have been identified, but 
not yet characterized or mapped or associated with a marker. 
Single nucleotide polymorphisms (SNPs)

Much of the focus of genomics has been in the identification of SNPs, 
which are important for a variety of reasons. They allow indirect testing 
(association of haplotypes) and direct testing (functional variants). They are the
most abundant and stable genetic markers. Common diseases are best 
explained by common genetic alterations, and the natural variation in the human 
population aids in understanding disease, therapy and environmental 
interactions. 

Currently, the only available method to identify SNPs in DNA is by
sequencing, which is expensive, difficult and laborious. Furthermore, once a
SNP is discovered it must be validated to determine if it is a real polymorphism
and not a sequencing error. Also, discovered SNPs must then be evaluated to
determine if they are associated with a particular phenotype. Thus, there is a
need to develop new paradigms for identifying the genomic basis for disease and
markers thereof. Therefore, it is an object herein to provide methods for
identifying the genomic basis of disease and markers thereof.
SUMMARY 

Databases and methods using the databases are provided herein. The 
databases comprise sets of parameters associated with subjects in populations
selected only on the basis of being healthy (i.e., where the subjects are 
mammals, such as humans, they are selected based upon apparent health and 
no detectable infections). The databases can be sorted based upon one or more 
of the selected parameters. 
The databases are preferably relational databases, in which an index that
represents each subject serves to relate parameters, which are the data, such as
age, ethnicity, sex, medical history, etc. and ultimately genotypic information,
that was inputted into and stored in the database. The database can then be
sorted according to these parameters. Initially, the parameter information is
obtained from a questionnaire answered by each subject from whom a body
tissue or body fluid sample is obtained. As additional information about each
sample is obtained, this information can be entered into the database and can
serve as a sorting parameter.

The databases obtained from healthy individuals have numerous uses, 
such as correlating known polymorphisms with a phenotype or disease. The 
databases can be used to identify alleles that are deleterious, that are beneficial, 
and that are correlated with diseases.

For purposes herein, genotypic information can be obtained by any 
method known to those of skill in the art, but is preferably obtained using mass 
spectrometry. 

Also provided herein is a new use for existing databases of subjects and
genotypic and other parameters, such as age, ethnicity, race, and gender. Any
database can be sorted according to the methods herein and alleles that exhibit
statistically significant correlations with any of the sorting parameters can be
identified. It is noted, however, that the databases provided herein and
randomly selected databases will perform better in these methods, since
disease-based databases suffer numerous limitations, including their relatively
small size, the homogeneity of the selected disease population, and the masking
effect of the polymorphism associated with the markers for which the database
was selected. Hence, the healthy database provided herein provides advantages
not heretofore recognized or exploited. However, the methods provided herein can
be used with a selected database, including disease-based databases, with or
without sorting for the discovery and correlation of polymorphisms. In addition,
the databases provided herein represent a greater genetic diversity than the
unselected databases typically utilized for the discovery of polymorphisms and
thus allow for the enhanced discovery and correlation of polymorphisms.

The databases provided herein can be used for taking an identified
polymorphism, and ascertaining whether it changes in frequency when the data
are sorted according to a selected parameter.

One use of these methods is correlating a selected marker with a
particular parameter by following the occurrence of known genetic markers and
then, having made this correlation, determining or identifying correlations with
diseases. Examples of this use are the p53 and lipoprotein lipase polymorphisms.
As exemplified herein, known markers are shown to have particular correlation
with certain groups, such as a particular ethnicity or race or one sex. Such
correlations will then permit development of better diagnostic tests and
treatment regimens.

These methods are valuable for identifying one or more genetic markers
whose frequency changes within the population as a function of age, ethnic
group, sex or some other criterion. This can allow the identification of previously
unknown polymorphisms and ultimately a gene or pathway involved in the onset
and progression of disease.
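
By way of illustration only, sorting subject records by a parameter and
comparing the frequency of a marker across the resulting groups might be
sketched as follows. The record layout, the marker name "marker_x", the allele
labels and the age bands are hypothetical and are not taken from the databases
described herein.

```python
from collections import defaultdict

def allele_frequency_by_group(records, marker, allele, group_key):
    """Return {group: frequency of `allele` at `marker`} for diploid genotypes."""
    counts = defaultdict(lambda: [0, 0])  # group -> [copies of allele, alleles typed]
    for rec in records:
        genotype = rec["genotypes"].get(marker)
        if genotype is None:
            continue  # subject not typed at this marker
        group = group_key(rec)
        counts[group][0] += genotype.count(allele)
        counts[group][1] += len(genotype)
    return {group: hits / total for group, (hits, total) in counts.items()}

def age_band(rec):
    """Hypothetical sorting parameter: coarse age bands."""
    if rec["age"] < 40:
        return "under_40"
    if rec["age"] > 60:
        return "over_60"
    return "40_to_60"

# Made-up records standing in for questionnaire and genotype data.
records = [
    {"age": 25, "sex": "F", "genotypes": {"marker_x": ("A", "B")}},
    {"age": 31, "sex": "M", "genotypes": {"marker_x": ("B", "B")}},
    {"age": 67, "sex": "F", "genotypes": {"marker_x": ("A", "A")}},
    {"age": 72, "sex": "M", "genotypes": {"marker_x": ("A", "B")}},
]
freqs = allele_frequency_by_group(records, "marker_x", "B", age_band)
```

Passing a different grouping function, for example one that returns the
recorded sex or ethnicity, sorts the same records by another parameter without
changing the counting logic.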

The databases and methods provided herein permit, among other things,
identification of components, particularly key components, of a disease process
by understanding its genetic underpinnings and also permit an understanding of
processes, such as individual drug responses. The databases and methods
provided herein also can be used in methods involving elucidation of pathological
pathways, in developing new diagnostic assays, identifying new potential drug
targets, and in identifying new drug candidates.

The methods and databases can be used with experimental procedures,
including, but not limited to, in silico SNP identification, in vitro SNP
identification/verification, genetic profiling of large populations, and
biostatistical analyses and interpretations.

Also provided herein are combinations that contain a database provided
herein and a biological sample from a subject in the database, and preferably
biological samples from all subjects or a plurality of subjects in the database.
Collections of the tissue and body fluid samples are also provided.

Also provided herein are methods for determining a genetic marker that
correlates with age, comprising identifying a polymorphism and determining the
frequency of the polymorphism with increasing age in a healthy population.

Further provided herein are methods for determining whether a genetic
marker correlates with susceptibility to morbidity, early mortality, or morbidity
and early mortality, comprising identifying a polymorphism and determining the
frequency of the polymorphism with increasing age in a healthy population.

Any of the methods herein described can be carried out in a multiplex
format.

Also provided are an apparatus and process for accurately identifying
genetic information. It is another object herein that genetic information be
extracted from genetic data in a highly automated manner. Therefore, to
overcome the deficiencies in the known conventional systems, a method and
apparatus for identifying a biological sample is proposed.

Briefly, the method and system for identifying a biological sample
generates a data set indicative of the composition of the biological sample. In a
particular example, the data set is DNA spectrometry data received from a mass
spectrometer. The data set is denoised, and a baseline is deleted. Since
possible compositions of the biological sample may be known, expected peak
areas may be determined. Using the expected peak areas, a residual baseline is
generated to further correct the data set. Probable peaks are then identifiable in
the corrected data set, which are used to identify the composition of the
biological sample. In a disclosed example, statistical methods are employed to
determine the probability that a probable peak is an actual peak, not an actual
peak, or that the data are too inconclusive to call.
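
By way of illustration only, the idea of correcting a baseline and then
placing each probable peak into one of three classes might be sketched as
follows. This sketch substitutes a running-minimum baseline and a simple
signal-to-noise ratio for the denoising and statistical methods summarized
above. The window size, the thresholds of 5 and 2 and the sample values are
hypothetical.

```python
import statistics

def remove_baseline(signal, window=5):
    """Subtract a running-minimum baseline estimate from the signal."""
    half = window // 2
    baseline = [min(signal[max(0, i - half): i + half + 1])
                for i in range(len(signal))]
    return [value - base for value, base in zip(signal, baseline)]

def call_peak(corrected, index, noise_indices, present=5.0, absent=2.0):
    """Return 'peak', 'no peak' or 'inconclusive' for the sample at `index`."""
    noise = statistics.pstdev([corrected[i] for i in noise_indices])
    if noise == 0:
        return "inconclusive"  # no noise estimate, so no ratio can be formed
    ratio = corrected[index] / noise
    if ratio >= present:
        return "peak"
    if ratio <= absent:
        return "no peak"
    return "inconclusive"

# Made-up intensities: a flat offset of 10, small ripple, one tall signal at index 7.
raw = [10, 11, 10, 12, 10, 11, 10, 60, 10, 11, 10, 12, 10]
corrected = remove_baseline(raw)
calls = {i: call_peak(corrected, i, noise_indices=range(5)) for i in (2, 3, 7)}
```

The middle class matters in practice: reporting a sample as inconclusive,
rather than forcing a call, is what allows the remaining calls to be relied
upon.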

Advantageously, the method and system for identifying a biological
sample accurately makes composition calls in a highly automated manner. In
such a manner, complete SNP profile information, for example, may be collected
efficiently. More importantly, the collected data is analyzed with highly accurate
results. For example, when a particular composition is called, the result may be
relied upon with great confidence. Such confidence is provided by the robust
computational process employed.

DESCRIPTION OF THE DRAWINGS

Figure 1 depicts an exemplary sample bank. Panel 1 shows the samples
as a function of sex and ethnicity. Panel 2 shows the Caucasians as a function
of age. Panel 3 shows the Hispanics as a function of age.

Figures 2A and 2C show an age- and sex-distribution of the 291S allele
of the lipoprotein lipase gene in which a total of 436 males and 589 females
were investigated. Figure 2B shows an age distribution for the 436 males.

Figure 3 is an exemplary questionnaire for population-based sample 
banking. 

Figure 4 depicts processing and tracking of blood sample components.

Figure 5 depicts the allelic frequency of "sick" alleles and "healthy"
alleles as a function of age. It is noted that the relative frequency of healthy
alleles increases in a population with increasing age.

Figure 6 depicts the age-dependent distribution of ApoE genotypes (see,
Schachter et al. (1994) Nature Genetics 6:29-32).

Figure 7A-D depicts age-related and genotype frequency of the p53
(tumor suppressor) codon 72 among the Caucasian population in the database.
*R72 and *P72 represent the frequency of the allele in the database population.
R72, R72P, and P72 represent the genotypes of the individuals in the population.
The frequency of the homozygous P72 allele drops from 6.7% to 3.7% with
age.

Figure 8 depicts the allele and genotype frequencies of the p21 S31R 
allele as a function of age. 

Figure 9 depicts the frequency of the FVII allele 353Q in pooled versus
individual samples.

Figure 10 depicts the frequency of the CETP (cholesterol ester transfer
protein) allele in pooled versus individual samples.

Figure 11 depicts the frequency of the plasminogen activator inhibitor-1
(PAI-1) 5G in pooled versus individual samples.

Figure 12 shows mass spectra of the samples and the ethnic diversity of
the PAI-1 alleles.

Figure 13 shows mass spectra of the samples and the ethnic diversity of
the CETP 405 alleles.

Figure 14 shows mass spectra of the samples and the ethnic diversity of
the Factor VII 353 alleles.

Figure 15 shows ethnic diversity of PAI-1, CETP and Factor VII using the
pooled DNA samples.

Figure 16 shows the p53-Rb pathway and the relationships among the
various factors in the pathway.

Figure 17, which is a block diagram of a computer constructed to provide
and process the databases described herein, depicts a typical computer system
for storing and sorting the databases provided herein and practicing the methods
provided herein.

Figure 18 is a flow diagram that illustrates the processing steps
performed using the computer illustrated in Figure 17, to maintain and provide
access to the databases for identifying polymorphic genetic markers.

Figure 19 is a histogram showing the allele and genotype distribution in
the age and sex stratified Caucasian population for the AKAP10-1 locus. Bright
green bars show frequencies in individuals younger than 40 years. Dark green
bars show frequencies in individuals older than 60 years.

Figure 20 is a histogram showing the allele and genotype distribution in
the age and sex stratified Caucasian population for the AKAP10-5 locus. Bright
green bars show frequencies in individuals younger than 40 years; dark green
bars show frequencies in individuals older than 60 years.

Figure 21 is a histogram showing the allele and genotype distribution in
the age and sex stratified Caucasian population for the h-msrA locus. Genotype
difference between male age groups is significant. Bright green bars show
frequencies in individuals younger than 40 years. Dark green bars show
frequencies in individuals older than 60 years.

Figure 22A-D is a sample data collection questionnaire used for the 
healthy database. 

Figure 23 is a flowchart showing processing performed by the computing
device of Figure 24 when performing genotyping of sense strands and antisense
strands from assay fragments.

Figure 24 is a block diagram showing a system in accordance with the
present invention;
Figure 25 is a flowchart of a method of identifying a biological sample in
accordance with the present invention;
Figure 26 is a graphical representation of data from a mass spectrometer;
Figure 27 is a diagram of wavelet transformation of mass spectrometry
data;
Figure 28 is a graphical representation of wavelet stage 0 hi data;
Figure 29 is a graphical representation of stage 0 noise profile;
Figure 30 is a graphical representation of generating stage noise standard
deviations;
Figure 31 is a graphical representation of applying a threshold to data
stages;
Figure 32 is a graphical representation of a sparse data set;
Figure 33 is a formula for signal shifting;
Figure 34 is a graphical representation of a wavelet transformation of a
denoised and shifted signal;
Figure 35 is a graphical representation of a denoised and shifted signal;
Figure 36 is a graphical representation of removing peak sections;
Figure 37 is a graphical representation of generating a peak free signal;
Figure 38 is a block diagram of a method of generating a baseline
correction;
Figure 39 is a graphical representation of a baseline and signal;
Figure 40 is a graphical representation of a signal with baseline removed;
Figure 41 is a table showing compressed data;
Figure 42 is a flowchart of a method for compressing data;
Figure 43 is a graphical representation of mass shifting;
Figure 44 is a graphical representation of determining peak width;
Figure 45 is a graphical representation of removing peaks;
Figure 46 is a graphical representation of a signal with peaks removed;
Figure 47 is a graphical representation of a residual baseline;
Figure 48 is a graphical representation of a signal with residual baseline
removed;
Figure 49 is a graphical representation of determining peak heights;
Figure 50 is a graphical representation of determining signal-to-noise for
each peak;
Figure 51 is a graphical representation of determining a residual error for
each peak;
Figure 52 is a graphical representation of peak probabilities;
Figure 53 is a graphical representation of applying an allelic ratio to peak
probability;
Figure 54 is a graphical representation of determining peak probability;
Figure 55 is a graphical representation of calling a genotype;
Figure 56 is a flowchart showing a statistical procedure for calling a
genotype;
Figure 57 is a flowchart showing processing performed by the computing
device of Figure 1 when performing standardless genotyping; and
Figure 58 is a graphical representation of applying an allelic ratio to peak
probability for standardless genotype processing.
DETAILED DESCRIPTION 

Definitions

Unless defined otherwise, all technical and scientific terms used herein
have the same meaning as is commonly understood by one of ordinary skill in
the art to which this invention belongs. All patents, applications, published
applications and other publications and sequences from GenBank and other
databases referred to herein throughout the disclosure are incorporated by
reference in their entirety.

As used herein, a biopolymer includes, but is not limited to, nucleic acid,
proteins, polysaccharides, lipids and other macromolecules. Nucleic acids
include DNA, RNA, and fragments thereof. Nucleic acids may be derived from
genomic DNA, RNA, mitochondrial nucleic acid, chloroplast nucleic acid and
other organelles with separate genetic material.

As used herein, morbidity refers to conditions, such as diseases or
disorders, that compromise the health and well-being of an organism, such as an
animal. Morbidity susceptibility or morbidity-associated genes are genes that,
when altered, for example, by a variation in nucleotide sequence, facilitate the
expression of a specific disease clinical phenotype. Thus, morbidity
susceptibility genes have the potential, upon alteration, of increasing the
likelihood or general risk that an organism will develop a specific disease.

As used herein, mortality refers to the statistical likelihood that an
organism, particularly an animal, will not survive a full predicted lifespan.
Hence, a trait or a marker, such as a polymorphism, associated with increased
mortality is observed at a lower frequency in older than younger segments of a
population.
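
By way of illustration only, one way to ask whether a marker is observed at
a lower frequency in an older segment than in a younger segment is a
one-sided two-proportion z-test. The choice of this test is an assumption made
for the sketch and is not stated in the definition above; the counts are made
up, and treating each allele copy as an independent observation is a
simplification.

```python
import math

def two_proportion_z(count_young, n_young, count_old, n_old):
    """Return (z, one-sided p-value) for H1: frequency is lower in the older group."""
    p_young = count_young / n_young
    p_old = count_old / n_old
    pooled = (count_young + count_old) / (n_young + n_old)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_young + 1 / n_old))
    z = (p_young - p_old) / std_err
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # upper tail of the standard normal
    return z, p_value

# Made-up allele counts: 120 of 1000 alleles in the young group, 80 of 1000 in the old.
z, p_value = two_proportion_z(120, 1000, 80, 1000)
```

A small p-value here would only flag the marker for follow-up; whether the
drop reflects mortality rather than sampling or population structure is a
separate question.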

As used herein, a polymorphism, e.g., genetic variation, refers to a
variation in the sequence of a gene in the genome amongst a population, such as
allelic variations and other variations that arise or are observed. Thus, a
polymorphism refers to the occurrence of two or more genetically determined
alternative sequences or alleles in a population. These differences can occur in
coding and non-coding portions of the genome, and can be manifested or
detected as differences in nucleic acid sequences, gene expression, including,
for example, transcription, processing, translation, transport, protein processing,
trafficking, DNA synthesis, expressed proteins, other gene products or products
of biochemical pathways or in post-translational modifications and any other
differences manifested amongst members of a population. A single nucleotide
polymorphism (SNP) refers to a polymorphism that arises as the result of a single
base change, such as an insertion, deletion or change in a base.

A polymorphic marker or site is the locus at which divergence occurs.
Such a site may be as small as one base pair (an SNP). Polymorphic markers
include, but are not limited to, restriction fragment length polymorphisms,
variable number of tandem repeats (VNTR's), hypervariable regions,
minisatellites, dinucleotide repeats, trinucleotide repeats, tetranucleotide repeats
and other repeating patterns, simple sequence repeats and insertional elements,
such as Alu. Polymorphic forms also are manifested as different Mendelian
alleles for a gene. Polymorphisms may be observed by differences in proteins,
protein modifications, RNA expression modification, DNA and RNA methylation,
regulatory factors that alter gene expression and DNA replication, and any other
manifestation of alterations in genomic nucleic acid or organelle nucleic acids.
As used herein, a healthy population refers to a population of organisms,
including, but not limited to, animals, bacteria, viruses, parasites, plants,
eubacteria, and others, that are disease free. The concept of disease-free is a
function of the selected organism. For example, for mammals it refers to a
subject not manifesting any disease state. Practically, a healthy subject, when
human, is defined as a human donor who passes blood bank criteria to donate
blood for eventual use in the general population. These criteria are as follows:
free of detectable viral, bacterial, mycoplasma, and parasitic infections; not
anemic; and then further selected based upon a questionnaire regarding history
(see Figure 3). Thus, a healthy population represents an unbiased population of
sufficient health to donate blood according to blood bank criteria, and not further
selected for any disease state. Typically such individuals are not taking any
medications. For plants, for example, it is a plant population that does not
manifest disease pathology associated with plants. For bacteria it is a bacterial
population replicating without environmental stress, such as selective agents,
heat and other pathogens.

As used herein, a healthy database (or healthy patient database) refers to
a database of profiles of subjects that have not been pre-selected for any
particular disease. Hence, the subjects that serve as the source of data for the
database are selected, according to predetermined criteria, to be healthy. In
contrast to other such databases that have been pre-selected for subjects with a
particular disease or other characteristic, the subjects for the database provided
herein are not so-selected. Also, if the subjects do manifest a disease or other
condition, any polymorphism discovered or characterized should be related to an
independent disease or condition. In a preferred embodiment, where the
subjects are human, a healthy subject manifests no disease symptoms and
meets criteria, such as those set by blood banks for blood donors.

Thus, the subjects for the database are a population of any organism,
including, but not limited to, animals, plants, bacteria, viruses, parasites and
any other organism or entity that has nucleic acid. Among preferred subjects are
mammals, preferably, although not necessarily, humans. Such a database can
capture the diversity of a population, thus providing for discovery of rare
polymorphisms.

As used herein, a profile refers to information relating to, but not limited
to and not necessarily including all of, age, sex, ethnicity, disease history, family
history, phenotypic characteristics, such as height and weight and other relevant
parameters. A sample information collection form is shown in Figure 22, which
illustrates the intent of the profile.

As used herein, a disease state is a condition or abnormality or disorder 
that may be inherited or result from environmental stresses, such as toxins, 
bacterial, fungal and viral infections. 

As used herein, a set of non-selected subjects means that the subjects
have not been pre-selected to share a common disease or other characteristic.
They can be selected to be healthy as defined herein.

As used herein, a phenotype refers to a set of parameters that includes
any distinguishable trait of an organism. A phenotype can be a physical trait and
can be, in instances in which the subject is an animal, a mental trait, such as
an emotional trait. Some phenotypes can be determined by observation elicited by
questionnaires (see, e.g., Figures 3 and 22) or by referring to prior medical and
other records. For purposes herein, a phenotype is a parameter around which
the database can be sorted.

As used herein, a parameter is any input data that will serve as a basis
for sorting the database. These parameters will include phenotypic traits,
medical histories, family histories and any other such information elicited from a
subject or observed about the subject. A parameter may describe the subject,
some historical or current environmental or social influence experienced by the
subject, or a condition or environmental influence on someone related to the
subject. Parameters include, but are not limited to, any of those described
herein, and known to those of skill in the art.

As used herein, haplotype refers to two or more polymorphisms located
on a single DNA strand. Hence, haplotyping refers to identification of two or
more polymorphisms on a single DNA strand. Haplotypes can be indicative of a
phenotype. For some disorders a single polymorphism may suffice to indicate a
trait; for others a plurality (i.e., a haplotype) may be needed. Haplotyping can be
performed by isolating nucleic acid and separating the strands. In addition,
when using enzymes, such as certain nucleases, that produce different size
fragments from each strand, strand separation is not needed for haplotyping.

As used herein, pattern, with reference to a mass spectrum
or mass spectrometry analyses, refers to a characteristic distribution and
number of signals (such as peaks or digital representations thereof).

As used herein, signal in the context of a mass spectrum and analysis
thereof refers to the output data, which is the number or relative number of
molecules having a particular mass. Signals include "peaks" and digital
representations thereof.

As used herein, adaptor, when used with reference to haplotyping using 
Fen ligase, refers to a nucleic acid that specifically hybridizes to a polymorphism 
of interest. An adaptor can be partially double-stranded. An adaptor complex is 
formed when an adaptor hybridizes to its target. 
As used herein, a target nucleic acid refers to any nucleic acid of interest
in a sample. It can contain one or more nucleotides.

As used herein, standardless analysis refers to a determination based 
upon an internal standard. For example, the frequency of a polymorphism can be 
determined herein by comparing signals within a single mass spectrum. 
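
By way of illustration only, comparing signals within a single mass
spectrum to estimate the frequency of each allele might be sketched as
follows. Using peak areas as the signals and normalizing by their sum is an
assumption made for the sketch, and the allele labels and area values are made
up.

```python
def allele_frequencies_from_peaks(peak_areas):
    """Normalize the allele peak areas of one spectrum so that they sum to 1."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("no signal in the supplied peaks")
    return {allele: area / total for allele, area in peak_areas.items()}

# Made-up peak areas for the two alleles of one marker in one pooled spectrum.
pooled_spectrum = {"allele_1": 300.0, "allele_2": 100.0}
frequencies = allele_frequencies_from_peaks(pooled_spectrum)
```

Because both signals come from the same spectrum, no external calibration
sample enters the calculation, which is the sense in which each signal serves
as the internal standard for the other.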
As used herein, amplifying refers to means for increasing the amount of a
biopolymer, especially nucleic acids. Based on the 5' and 3' primers that are
chosen, amplification also serves to restrict and define the region of the genome
which is subject to analysis. Amplification can be done by any means known to
those skilled in the art, including use of the polymerase chain reaction (PCR) etc.
Amplification, e.g., PCR, must be done quantitatively when the frequency of
polymorphism is required to be determined.

As used herein, cleaving refers to non-specific and specific fragmentation
of a biopolymer.

As used herein, multiplexing refers to the simultaneous detection of more
than one polymorphism. Methods for performing multiplexed reactions,
particularly in conjunction with mass spectrometry, are known (see, e.g., U.S.
Patent Nos. 6,043,031, 5,547,835 and International PCT application No.
WO 97/37041).

As used herein, reference to mass spectrometry encompasses any suitable
mass spectrometric format known to those of skill in the art. Such formats
include, but are not limited to, Matrix-Assisted Laser Desorption/Ionization,
Time-of-Flight (MALDI-TOF), Electrospray (ES), IR-MALDI (see, e.g., published
International PCT application No. 99/57318 and U.S. Patent No. 5,118,937), Ion
Cyclotron Resonance (ICR), Fourier Transform and combinations thereof.
MALDI, particularly UV and IR, are among the preferred formats.

As used herein, mass spectrum refers to the presentation of data
obtained from analyzing a biopolymer or fragment thereof by mass spectrometry
either graphically or encoded numerically.

As used herein, a blood component is a component that is separated from
blood and includes, but is not limited to, red blood cells and platelets, blood
clotting factors, plasma, enzymes, plasminogen, immunoglobulins. A cellular
blood component is a component of blood, such as a red blood cell, that is a
cell. A blood protein is a protein that is normally found in blood. Examples of
such proteins are blood factors VII and VIII. Such proteins and components are
well-known to those of skill in the art.

As used herein, plasma can be prepared by any method known to those
of skill in the art. For example, it can be prepared by centrifuging blood at a
force that pellets the red cells and forms an interface between the red cells and
the buffy coat, which contains leukocytes, above which is the plasma. For
example, typical platelet concentrates contain at least about 10% plasma.

Blood may be separated into its components, including, but not limited to,
plasma, platelets and red blood cells by any method known to those of skill in
the art. For example, blood can be centrifuged for a sufficient time and at a
sufficient acceleration to form a pellet containing the red blood cells. Leukocytes
collect primarily at the interface of the pellet and supernatant in the buffy coat
region. The supernatant, which contains plasma, platelets, and other blood
components, may then be removed and centrifuged at a higher acceleration,
whereby the platelets pellet.

As used herein, p53 is a cell cycle control protein that assesses DNA
damage and acts as a transcription factor regulating genes which control cell
growth, DNA repair and apoptosis. The p53 mutations have been found in a
wide variety of different cancers, including all of the different types of leukemia,
with varying frequency. The loss of normal p53 functions results in genomic
instability and uncontrolled growth of the host cell.

As used herein, p21 is a cyclin-dependent kinase inhibitor, associated
with G1 phase arrest of normal cells. Expression triggers apoptosis or
programmed cell death and has been associated with Wilms' tumor, a pediatric
kidney cancer.

As used herein, Factor VII is a serine protease involved in the extrinsic blood
coagulation cascade. This factor is activated by thrombin and works with tissue
factor (Factor III) in the processing of Factor X to Factor Xa. Evidence has
supported an association between polymorphisms in the gene and increased
Factor VII activity, which can result in an elevated risk of ischemic cardiovascular
disease including myocardial infarction.

As used herein, a relational database stores information in a form
representative of matrices, such as two-dimensional tables, including rows and
columns of data, or higher dimensional matrices. For example, in one
embodiment, the relational database has separate tables each with a parameter.
The tables are linked with a record number, which also acts as an index. The
database can be searched or sorted by using data in the tables and is stored in
any suitable storage medium, such as floppy disk, CD-ROM disk, hard drive or
other suitable medium.
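
By way of illustration only, separate parameter tables linked by a record
number might be sketched with an in-memory SQLite database as follows. The
table names, column names and sample rows are hypothetical and are not the
schema of the databases described herein.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One table per parameter; record_no links the tables and acts as the index.
conn.executescript("""
CREATE TABLE subject  (record_no INTEGER PRIMARY KEY);
CREATE TABLE age      (record_no INTEGER REFERENCES subject(record_no), years INTEGER);
CREATE TABLE sex      (record_no INTEGER REFERENCES subject(record_no), sex TEXT);
CREATE TABLE genotype (record_no INTEGER REFERENCES subject(record_no),
                       marker TEXT, genotype_call TEXT);
""")

# Made-up rows standing in for questionnaire answers and genotyping results.
conn.executemany("INSERT INTO subject VALUES (?)", [(1,), (2,), (3,)])
conn.executemany("INSERT INTO age VALUES (?, ?)", [(1, 34), (2, 67), (3, 45)])
conn.executemany("INSERT INTO sex VALUES (?, ?)", [(1, "F"), (2, "M"), (3, "F")])
conn.executemany("INSERT INTO genotype VALUES (?, ?, ?)",
                 [(1, "marker_x", "AB"), (2, "marker_x", "AA"), (3, "marker_x", "BB")])

# Relate the parameters through the record number and sort by one of them.
rows = conn.execute("""
    SELECT s.record_no, a.years, x.sex, g.genotype_call
    FROM subject s
    JOIN age a      ON a.record_no = s.record_no
    JOIN sex x      ON x.record_no = s.record_no
    JOIN genotype g ON g.record_no = s.record_no
    WHERE g.marker = ?
    ORDER BY a.years
""", ("marker_x",)).fetchall()
```

Keeping each parameter in its own table means that a parameter obtained
later, such as a new genotype, can be added as a further table keyed on the
same record number without altering the existing ones.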

As used herein, a bar code refers to any array of optically readable marks
of any desired size and shape that are arranged in a reference context or frame
of, preferably, although not necessarily, one or more columns and one or more
rows. For purposes herein, the bar code refers to any symbology, not necessarily
"bars", and may include dots, characters or any symbol or symbols.

As used herein, symbology refers to an identifier code or symbol, such as
a bar code, that is linked to a sample. The index will reference each such
symbology. The symbology is any code known or designed by the user. The
symbols are associated with information stored in the database. For example,
each sample can be uniquely identified with an encoded symbology. The
parameters, such as the answers to the questions and subsequent genotypic and
other information obtained upon analysis of the samples, are included in the
database and associated with the symbology. The database is stored on any
suitable recording medium, such as a hard drive, a floppy disk, a tape, a CD
ROM, a DVD disk and any other suitable medium.
DATABASES 

Human genotyping is currently dependent on collaborations with
hospitals, tissue banks and research institutions that provide samples of disease
tissue. This approach is based on the concept that the onset and/or progression
of diseases can be correlated with the presence of polymorphisms or other
genetic markers. This approach does not consider that disease can be correlated
with both the presence of specific markers and the absence of specific markers. It is
shown herein that identification and scoring of the appearance and
disappearance of markers is possible only if these markers are measured in the
background of healthy subjects where the onset of disease does not mask the
change in polymorphism occurrence. Databases of information from disease
populations suffer from small sample size, selection bias and heterogeneity. The
databases provided herein from healthy populations solve these problems by
permitting large sample bands, simple selection methods and diluted
heterogeneity.

Provided herein are first databases of parameters associated with
non-selected, particularly healthy, subjects. Also provided are combinations of the
databases with indexed samples obtained from each of the subjects. Further
provided are databases produced from the first databases. These contain, in
addition to the original parameters, information, such as genotypic information,
including, but not limited to, genomic sequence information, derived from the
samples.

The databases, which are herein designated healthy databases, are
so-designated because they are not obtained from subjects pre-selected for a
particular disease. Hence, although individual members may have a disease, the
collection of individuals is not selected to have a particular disease.

The subjects from whom the parameters are obtained comprise either a
set of subjects who are randomly selected across, preferably, all populations, or
are pre-selected to be disease-free or healthy. As a result, the database is not
selected to be representative of any pre-selected phenotype, genotype, disease
or other characteristic. Typically the number of subjects from which the
database is prepared is selected to produce statistically significant results when
used in the methods provided herein. Preferably, the number of subjects will be
greater than 100, more preferably greater than 200, yet more preferably greater
than 1000. The precise number can be empirically determined based upon the
frequency of the parameter(s) that will be used to sort the database. Generally the
population can have at least 50, at least 100, at least 200, at least 500, at least
1000, at least 5000 or at least 10,000 or more subjects.

Upon identification of a collection of subjects, information about each
subject is recorded and associated with each subject as a database. The
information associated with each of the subjects includes, but is not limited to,
information related to historical characteristics of the subjects, phenotypic
characteristics and also genotypic characteristics, medical characteristics and
any other traits and characteristics about the subject that can be determined.
This information will serve as the basis for sorting the database.

In an exemplary embodiment, the subjects are mammals, such as
humans, and the information relates to one or more parameters, such as age,
sex, medical history, ethnicity and any other factor. Such information, when the
animals are humans, for example, can be obtained by a questionnaire, and by
observations about the individual, such as hair color, eye color and other
characteristics. Genotypic information will be obtained from tissue or other body
and body fluid samples from the subject.

The healthy genomic database can include profiles and polymorphisms
from healthy individuals from a library of blood samples where each sample in
the library is an individual and separate blood or other tissue sample. Each
sample in the database is profiled as to the sex, age, ethnic group, and disease
history of the donor.

The databases are generated by first identifying healthy populations of 
subjects and obtaining information about each subject that will serve as the 
sorting parameters for the database. This information is preferably entered into 
a storage medium, such as the memory of a computer. 

The information obtained about each subject in a population used for
generating the database is stored in a computer memory or other suitable
storage medium. The information is linked to an identifier associated with each
subject. Hence the database will identify a subject, for example by a datapoint
representative of a bar code, and then all information, such as the information
from a questionnaire, regarding the individual is associated with the datapoint.
As the information is collected the database is generated.

Thus, for example, profile information, such as subject histories obtained
from questionnaires, is collected in the database. The resulting database can be
sorted as desired, using standard software, such as by age, sex and/or ethnicity.
An exemplary questionnaire for subjects from whom samples are to be obtained
is shown in Figures 22A-D. Each questionnaire preferably is identified by a bar
code, particularly a machine readable bar code for entry into the database. After
a subject provides data and is deemed to be healthy (i.e., meets standards for
blood donation), the data in the questionnaire is entered into the database and is
associated with the bar code. A tissue, cell or blood sample is obtained from the
subject.

Figure 4 exemplifies processing and tracking of blood sample
components. Each component is tracked with a bar code, dated, entered into
the database and associated with the subject and the profile of the subject.
Typically, the whole blood is centrifuged to produce plasma, red blood cells
(which pellet) and leukocytes found in the buffy coat, which layers in between.
Various samples are obtained and coded with a bar code and stored for use as
needed.

Samples are collected from the subjects. The samples include, but are
not limited to, tissues, cells, and fluids, such as nucleic acid, blood, plasma,
amniotic fluid, synovial fluid, urine, saliva, aqueous humor, sweat, sperm
samples and cerebral spinal fluid. It is understood that the particular set of
samples depends upon the organisms in the population.

Once samples are obtained the collection can be stored and, in preferred
embodiments, each sample is indexed with an identifier, particularly a machine
readable code, such as a bar code. For analyses, the samples or components of
the samples, particularly biopolymers and small molecules, such as nucleic acids
and/or proteins and metabolites, are isolated.

After samples are analyzed, this information is entered into the database
in the memory of the storage medium and associated with each subject. This
information includes, but is not limited to, genotypic information, particularly
nucleic acid sequence information and other information indicative of
polymorphisms, such as masses of PCR fragments, peptide fragment sequences
or masses, spectra of biopolymers and small molecules and other indicia of the
structure or function of a gene, gene product or other marker from which the
existence of a polymorphism within the population can be inferred.

In an exemplary embodiment, a database can be derived from a collection
of blood samples. For example, Figure 1 (see, also Figure 10) shows the status
of a collection of over 5000 individual samples. The samples were processed in
the laboratory following SOP (standard operating procedure) guidelines. Any
standard blood processing protocol may be used.

For the exemplary database described herein, the following criteria were
used to select subjects:

No testing is done for infectious agents.
Age: At least 17 years old
Weight: Minimum of 110 pounds
Permanently Disqualified:
    History of hepatitis (after age 11)
    Leukemia, lymphoma
    Human immunodeficiency virus (HIV), AIDS
    Chronic kidney disease
Temporarily Disqualified:
    Pregnancy - until six weeks after delivery, miscarriage or abortion
    Major surgery or transfusions - for one year
    Mononucleosis - until complete recovery
    Prior whole blood donation - for eight weeks
    Antibiotics by injection, for one week; by mouth, for forty-eight hours,
    except antibiotics for skin complexion
5 year Deferment:
    Internal cancer and skin cancer if it has been removed, is healed and
    there is no recurrence

These correspond to blood bank criteria for donating blood and represent a
healthy population as defined herein for a human healthy database.
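
For illustration, the selection criteria listed above could be applied to questionnaire data along the following lines. This is a minimal sketch, not the procedure actually used to assemble the exemplary database: the Donor record, the condition keys and the conversion of the deferral periods into days are all assumptions introduced here, and the exception for antibiotics taken for skin complexion is not modeled.

```python
from dataclasses import dataclass, field

# Hypothetical keys for the permanently disqualifying conditions listed above.
PERMANENT = {"hepatitis_after_age_11", "leukemia", "lymphoma",
             "hiv_aids", "chronic_kidney_disease"}

# Deferral periods from the list above, approximated in days (an assumption).
DEFERRAL_DAYS = {
    "pregnancy_end": 42,                  # six weeks
    "major_surgery_or_transfusion": 365,  # one year
    "whole_blood_donation": 56,           # eight weeks
    "antibiotics_injected": 7,            # one week
    "antibiotics_oral": 2,                # forty-eight hours
    "cancer_removed_and_healed": 5 * 365, # five year deferment
}


@dataclass
class Donor:
    age_years: int
    weight_lb: float
    history: set = field(default_factory=set)       # conditions ever reported
    days_since: dict = field(default_factory=dict)  # event -> days elapsed
    recovering_from_mononucleosis: bool = False


def is_eligible(donor: Donor) -> bool:
    """Return True when none of the listed exclusion criteria applies."""
    if donor.age_years < 17 or donor.weight_lb < 110:
        return False
    if donor.history & PERMANENT:
        return False
    if donor.recovering_from_mononucleosis:
        return False
    # Every recorded deferral event must be older than its deferral period.
    return all(days >= DEFERRAL_DAYS[event]
               for event, days in donor.days_since.items()
               if event in DEFERRAL_DAYS)


ok = is_eligible(Donor(age_years=30, weight_lb=150))
recent_donation = is_eligible(
    Donor(age_years=30, weight_lb=150, days_since={"whole_blood_donation": 30}))
```

Only subjects for whom such a check passes would have their questionnaire data entered into the database.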
Structure of the database 

Any suitable database structure and format known to those of skill in the
art may be employed. For example, a relational database is a preferred format in
which data is stored as matrices or tables of the parameters linked by an indexer
that identifies each subject. Software for preparing and manipulating, including
sorting the database, can be readily developed or adapted from commercially
available software, such as Microsoft Access.

Quality control 

Quality control procedures can be implemented. For example, after
collection of samples, the quality of the collection in the bank can be assessed.
For example, mix-up of samples can be checked by testing for known markers,
such as sex. After samples are separated by ethnicity, samples are randomly
tested for a marker associated with a particular ethnicity, such as HLA DQA1
group specific component, to assess whether the samples have been properly
sorted by ethnic group. An exemplary sample bank is depicted in Figure 4.
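
The sex-marker check for sample mix-ups described above might be sketched as follows. The bar codes and results are hypothetical; the sketch simply assumes that the sex recorded on the questionnaire and the sex inferred from a genetic marker are both available per bar code.

```python
def sex_mismatches(recorded_sex, typed_sex):
    """Return bar codes whose marker-derived sex disagrees with the questionnaire."""
    return sorted(code for code, sex in recorded_sex.items()
                  if code in typed_sex and typed_sex[code] != sex)


# Hypothetical data: S0002 was recorded as male but typed as female.
recorded = {"S0001": "F", "S0002": "M", "S0003": "F"}
typed = {"S0001": "F", "S0002": "F", "S0003": "F"}
flagged = sex_mismatches(recorded, typed)
```

Any flagged bar code would point to a possible sample mix-up to be investigated before the data are used.
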
Obtaining genotypic data and other parameters for the database

After informational and historical parameters are entered into the
database, material from samples obtained from each subject is analyzed.
Analyzed material includes proteins, metabolites, nucleic acids, lipids and any
other desired constituent of the material. For example, nucleic acids, such as
genomic DNA, can be analyzed by sequencing.

Sequencing can be performed using any method known to those of skill in
the art. For example, if a polymorphism is identified or known, and it is desired
to assess its frequency or presence among the subjects in the database, the
region of interest from each sample can be isolated, such as by PCR or
restriction fragments, hybridization or other suitable method known to those of
skill in the art and sequenced. For purposes herein, sequencing analysis is
preferably effected using mass spectrometry (see, e.g., U.S. Patent Nos.
5,547,835, 5,622,824, 5,851,765, and 5,928,906). Nucleic acids can also be
sequenced by hybridization (see, e.g., U.S. Patent Nos. 5,503,980, 5,631,134,
5,795,714) and including analysis by mass spectrometry (see, U.S. application
Serial Nos. 08/419,994 and 09/395,409).

In other detection methods, it is necessary to first amplify prior to
identifying the allelic variant. Amplification can be performed, e.g., by PCR
and/or LCR, according to methods known in the art. In one embodiment,
genomic DNA of a cell is exposed to two PCR primers and amplified for a
number of cycles sufficient to produce the required amount of amplified DNA. In
preferred embodiments, the primers are located between 150 and 350 base pairs
apart.

Alternative amplification methods include: self sustained sequence
replication (Guatelli, J. C. et al., 1990, Proc. Natl. Acad. Sci. U.S.A.
87:1874-1878), transcriptional amplification system (Kwoh, D. Y. et al., 1989,
Proc. Natl. Acad. Sci. U.S.A. 86:1173-1177), Q-Beta Replicase (Lizardi, P. M. et
al., 1988, Bio/Technology 6:1197), or any other nucleic acid amplification
method, followed by the detection of the amplified molecules using techniques
well known to those of skill in the art. These detection schemes are especially
useful for the detection of nucleic acid molecules if such molecules are present
in very low numbers.

Nucleic acids can also be analyzed by detection methods and protocols,
particularly those that rely on mass spectrometry (see, e.g., U.S. Patent Nos.
5,605,798 and 6,043,031, allowed copending U.S. application Serial No.
08/744,481, U.S. application Serial No. 08/990,851 and International PCT
application No. WO 99/31273, International PCT application No. WO 98/20019).
These methods can be automated (see, e.g., copending U.S. application Serial
No. 09/285,481 and published International PCT application No.
PCT/US00/08111, which describes an automated process line). Preferred
among the methods of analysis herein are those involving the primer oligo base
extension (PROBE) reaction with mass spectrometry for detection (described
herein and elsewhere, see e.g., U.S. Patent No. 6,043,031; see, also U.S.
application Serial Nos. 09/287,681, 09/287,682, 09/287,141 and 09/287,679,
allowed copending U.S. application Serial No. 08/744,481, International PCT
application No. PCT/US97/20444, published as International PCT application No.
WO 98/20019, and based upon U.S. application Serial Nos. 08/744,481,
08/744,590, 08/746,036, 08/746,055, 08/786,988, 08/787,639, 08/933,792,
08/746,055, 08/786,988 and 08/787,639; see, also U.S. application Serial No.
09/074,936, U.S. Patent No. 6,024,925, and U.S. application Serial Nos.
08/746,055 and 08/786,988, and published International PCT application No.
WO 98/20020).

A preferred format for performing the analyses is a chip based format in
which the biopolymer is linked to a solid support, such as a silicon or
silicon-coated substrate, preferably in the form of an array. More preferably, when
analyses are performed using mass spectrometry, particularly MALDI, small
nanoliter volumes of sample are loaded on, such that the resulting spot is about,
or smaller than, the size of the laser spot. It has been found that when this is
achieved, the results from the mass spectrometric analysis are quantitative. The
areas under the signals in the resulting mass spectra are proportional to
concentration (when normalized and corrected for background). Methods for
preparing and using such chips are described in U.S. Patent No. 6,024,925,
co-pending U.S. application Serial Nos. 08/786,988, 09/364,774, 09/371,150 and
09/297,575; see, also International PCT application No. PCT/US97/20195, which
published as WO 98/20020. Chips and kits for performing these analyses are
commercially available from SEQUENOM under the trademark MassARRAY.
MassARRAY relies on the fidelity of the enzymatic primer extension reactions
combined with the miniaturized array and MALDI-TOF (Matrix-Assisted Laser
Desorption Ionization-Time of Flight) mass spectrometry to deliver results rapidly.
It accurately distinguishes single base changes in the size of DNA fragments
associated with genetic variants without tags.

The methods provided herein permit quantitative determination of alleles.
The areas under the signals in the mass spectra can be used for quantitative
determinations. The frequency is determined from the ratio of the signal to the
total area of the spectrum and is corrected for background. This is possible
because of the PROBE technology as described in the above applications
incorporated by reference herein.
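
The area-ratio calculation described above can be sketched as follows. This is one plausible reading only: it assumes that a single constant background value is subtracted from each allele's peak area and that the "total area" is the sum of the corrected allele peak areas. The function name and the numerical values are hypothetical.

```python
def allele_frequencies(peak_areas, background=0.0):
    """Convert per-allele peak areas into relative frequencies.

    peak_areas maps an allele label to the integrated area of its signal;
    background is a constant area subtracted from every peak (an assumption).
    """
    corrected = {allele: max(area - background, 0.0)
                 for allele, area in peak_areas.items()}
    total = sum(corrected.values())
    if total == 0:
        raise ValueError("no signal above background")
    return {allele: area / total for allele, area in corrected.items()}


# Hypothetical peak areas for the two alleles of a biallelic marker.
freqs = allele_frequencies({"R": 820.0, "Q": 220.0}, background=20.0)
```

With these illustrative numbers the corrected areas are 800 and 200, so the two alleles are estimated at 80% and 20% respectively.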

Additional methods of analyzing nucleic acids include amplification-based
methods including polymerase chain reaction (PCR), ligase chain reaction (LCR),
mini-PCR, rolling circle amplification, autocatalytic methods, such as those using
Qβ replicase, TAS, 3SR, and any other suitable method known to those of skill
in the art.

Other methods for analysis and identification and detection of
polymorphisms include, but are not limited to, allele specific probes, Southern
analyses, and other such analyses.

The methods described below provide ways to fragment given amplified
or non-amplified nucleotide sequences thereby producing a set of mass signals
when mass spectrometry is used to analyze the fragment mixtures.
Amplified fragments are produced by standard polymerase chain reaction methods
(US 4,683,195 and 4,683,202). The fragmentation method involves the use of
enzymes that cleave single or double strands of DNA and enzymes that ligate
DNA. The cleavage enzymes can be glycosylases, nickases, and site-specific
and non site-specific nucleases with the most preferred enzymes being
glycosylases, nickases, and site-specific nucleases.
Glycosylase Fragmentation Method

DNA glycosylases specifically remove a certain type of nucleobase from a
given DNA fragment. These enzymes can thereby produce abasic sites, which
can be recognized either by another cleavage enzyme, cleaving the exposed
phosphate backbone specifically at the abasic site and producing a set of
nucleobase specific fragments indicative of the sequence, or by chemical means,
such as alkaline solutions and/or heat. The use of one combination of a DNA
glycosylase and its targeted nucleotide would be sufficient to generate a base
specific signature pattern of any given target region.

Numerous DNA glycosylases are known. For example, a DNA glycosylase
can be uracil-DNA glycosylase (UDG), 3-methyladenine DNA glycosylase,
3-methyladenine DNA glycosylase II, pyrimidine hydrate-DNA glycosylase,
FaPy-DNA glycosylase, thymine mismatch-DNA glycosylase, hypoxanthine-DNA
glycosylase, 5-Hydroxymethyluracil DNA glycosylase (HmUDG),
5-Hydroxymethylcytosine DNA glycosylase, or 1,N6-ethenoadenine DNA
glycosylase (see, e.g., U.S. Patent Nos. 5,536,649, 5,888,795, 5,952,176
and 6,099,553, International PCT application Nos. WO 97/03210,
WO 99/54501; see, also, Eftedal et al. (1993) Nucleic Acids Res 21:2095-2101,
Bjelland and Seeberg (1987) Nucleic Acids Res. 15:2787-2801, Saparbaev et al.
(1995) Nucleic Acids Res. 23:3750-3755, Bessho (1999) Nucleic Acids Res.
27:979-983) corresponding to the enzyme's modified nucleotide or nucleotide
analog target. A preferred glycosylase is uracil-DNA glycosylase (UDG).

Uracil, for example, can be incorporated into an amplified DNA molecule
by amplifying the DNA in the presence of normal DNA precursor nucleotides
(e.g. dCTP, dATP, and dGTP) and dUTP. When the amplified product is treated
with UDG, uracil residues are cleaved. Subsequent chemical treatment of the
products from the UDG reaction results in the cleavage of the phosphate
backbone and the generation of nucleobase specific fragments. Moreover, the
separation of the complementary strands of the amplified product prior to
glycosylase treatment allows complementary patterns of fragmentation to be
generated. Thus, the use of dUTP and Uracil DNA glycosylase allows the
generation of T specific fragments for the complementary strands, thus providing
information on the T as well as the A positions within a given sequence. Similar
to this, a C-specific reaction on both (complementary) strands (i.e. with a
C-specific glycosylase) yields information on C as well as G positions within a
given sequence if the fragmentation patterns of both amplification strands are
analyzed separately. Thus, with the glycosylase method and mass
spectrometry, a full series of A, C, G and T specific fragmentation patterns can
be analyzed.
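
The T-specific fragmentation of both strands described above can be illustrated with a short idealized sketch. It assumes that every T position was incorporated as uracil and that cleavage is complete, and it treats the removed nucleotide as simply vanishing; the terminal chemistry of real fragments and their masses are not modeled. The example sequence is arbitrary.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def t_specific_fragments(strand):
    """Split a strand at every T, as if each T were a uracil removed by UDG."""
    return [fragment for fragment in strand.split("T") if fragment]


forward = "GATTACAGTC"  # arbitrary illustrative sequence
reverse = reverse_complement(forward)

forward_fragments = t_specific_fragments(forward)
reverse_fragments = t_specific_fragments(reverse)
```

Because each T on the reverse strand pairs with an A on the forward strand, the two fragment sets together locate both the T and the A positions of the forward sequence.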

Nickase Fragmentation Method 

A DNA nickase, or DNase, can be used to recognize and cleave one strand
of a DNA duplex. Numerous nickases are known. Among these, for example,
are NY2A nickase and NYS1 nickase (Megabase) with the following
cleavage sites:

NY2A: 5'...RAG...3'
      3'...Y TC...5' where R = A or G and Y = C or T
NYS1: 5'...CC[A/G/T]...3'
      3'...GG[T/C/A]...5'.
Fen-Ligase Fragmentation Method 

The Fen-ligase method involves two enzymes: Fen-1 enzyme and a ligase.
The Fen-1 enzyme is a site-specific nuclease known as a "flap" endonuclease
(US 5,843,669, 5,874,283, and 6,090,606). This enzyme recognizes and
cleaves DNA "flaps" created by the overlap of two oligonucleotides hybridized to
a target DNA strand. This cleavage is highly specific and can recognize single
base pair mutations, permitting detection of a single homologue from an
individual heterozygous at one SNP of interest and then genotyping that
homologue at other SNPs occurring within the fragment. Fen-1 enzymes can be
Fen-1 like nucleases, e.g. human, murine, and Xenopus XPG enzymes and yeast
RAD2 nucleases, or Fen-1 endonucleases from, for example, M. jannaschii, P.
furiosus, and P. woesei. Among preferred enzymes are the Fen-1 enzymes.

The ligase enzyme forms a phosphodiester bond between two double
stranded nucleic acid fragments. The ligase can be DNA Ligase I or DNA Ligase
III (see, e.g., U.S. Patent Nos. 5,506,137, 5,700,672, 5,858,705 and
5,976,806; see, also, Waga et al. (1994) J. Biol. Chem. 269:10923-10934, Li
et al. (1994) Nucleic Acids Res. 22:632-638, Arrand et al. (1986) J. Biol. Chem.
261:9079-9082, Lehman (1974) Science 186:790-797, Higgins and Cozzarelli
(1979) Methods Enzymol. 68:50-71, Lasko et al. (1990) Mutation Res.
236:277-287, and Lindahl and Barnes (1992) Ann. Rev. Biochem. 61:251-281).
Thermostable ligases (Epicenter Technologies), where "thermostable"
denotes that the ligase retains activity even after exposure to temperatures
necessary to separate two strands of DNA, are among preferred ligases for use
herein.

Type IIS Enzyme Fragmentation Method 

Restriction enzymes bind specifically to and cleave double-stranded DNA
at specific sites within or adjacent to a particular recognition sequence. These
enzymes have been classified into three groups (e.g. Types I, II, and III) as
known to those of skill in the art. Because of the properties of type I and type III
enzymes, they have not been widely used in molecular biological applications.
Thus, for this invention type II enzymes are preferred. Of the thousands of
restriction enzymes known in the art, there are 179 different type II
specificities. Of the 179 unique type II restriction endonucleases, 31 have a
4-base recognition sequence, 11 have a 5-base recognition sequence, 127 have a
6-base recognition sequence, and 10 have recognition sequences of greater than
six bases (US 5,604,098). Of category type II enzymes, type IIS is preferred.

Type IIS enzymes can be Alw XI, Bbv I, Bce 83, Bpm I, Bsg I, Bsm AI,
Bsm FI, Bsa I, Bcc I, Bcg I, Ear I, Eco 57I, Esp 3I, Fau I, Fok I, Gsu I, Hga I, Mme
I, Mbo II, Sap I, and the like. The preferred type IIS enzyme is Fok I.

The Fok I endonuclease is an exemplary well characterized
member of the Type IIS class (see, e.g., U.S. Patent Nos. 5,714,330,
5,604,098, 5,436,150, 6,054,276 and 5,871,911; see, also, Szybalski et al.
(1991) Gene 100:13-26, Wilson and Murray (1991) Ann. Rev. Genet.
25:585-627, Sugisaki et al. (1981) Gene 16:73-78, Podhajska and Szybalski
(1985) Gene 40:175-182). Fok I recognizes the sequence 5'-GGATG-3' and
cleaves DNA accordingly. Type IIS restriction sites can be introduced into DNA
targets by incorporating the site into primers used to amplify such targets.
Fragments produced by digestion with Fok I are site specific and can be analyzed
by mass spectrometry methods such as MALDI-TOF mass spectrometry, ESI-TOF
mass spectrometry, and any other type of mass spectrometry well known to
those of skill in the art.
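
As an illustration, Fok I recognition sites can be located in a target sequence by searching both strands for 5'-GGATG-3'. The sketch below only finds recognition sites; the position of the cut relative to the site is not stated in this passage and is deliberately not modeled. The function names and the test sequence are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(seq, site="GGATG"):
    """List (index, strand) for every occurrence of the site on either strand.

    The index is the 0-based position on the given (top) strand; "+" means the
    site reads 5'->3' on the top strand, "-" that it lies on the bottom strand.
    """
    seq = seq.upper()
    bottom_site = reverse_complement(site)
    hits = []
    for i in range(len(seq) - len(site) + 1):
        window = seq[i:i + len(site)]
        if window == site:
            hits.append((i, "+"))
        elif window == bottom_site:
            hits.append((i, "-"))
    return hits


sites = find_sites("AAGGATGCCCCATCCTT")  # arbitrary illustrative sequence
```

The same search could be applied to a primer to confirm that it introduces the intended recognition site into the amplified target.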

Once a polymorphism has been found to correlate with a parameter
such as age, the possibility of false results due to allelic dropout is examined by
doing comparative PCR in an adjacent region of the genome.
Analyses 

In using the database, allelic frequencies can be determined across the
population by analyzing each sample in the population individually, determining
the presence or absence of the allele or marker of interest in each individual sample,
and then determining the frequency of the marker in the population. The
database can then be sorted (stratified) to identify any correlations between the
allele and a selected parameter using standard statistical analysis. If a
correlation is observed, such as a decrease in a particular marker with age or
correlation with sex or other parameter, then the marker is a candidate for
further study, such as genetic mapping to identify a gene or pathway in which it
is involved. The marker may then be correlated, for example, with a disease.
Haplotyping can also be carried out. Genetic mapping can be effected using
standard methods and may also require use of databases of others, such as
databases previously determined to be associated with a disorder.
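
The stratify-then-compare analysis described above can be sketched as follows. The subject records, the age cut-off of 40 years and the choice of a Pearson chi-square statistic on a 2x2 table of allele counts are assumptions made for illustration; any standard statistical test could be substituted.

```python
def allele_counts(genotypes, allele):
    """Return (chromosomes carrying the allele, chromosomes not carrying it)."""
    alleles = [a for genotype in genotypes for a in genotype]
    carrying = alleles.count(allele)
    return carrying, len(alleles) - carrying


def allele_frequency(genotypes, allele):
    carrying, other = allele_counts(genotypes, allele)
    return carrying / (carrying + other)


def stratify(subjects, key):
    """Group genotypes by the value of a sorting parameter."""
    strata = {}
    for subject in subjects:
        strata.setdefault(key(subject), []).append(subject["genotype"])
    return strata


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 d.f., no continuity correction)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Hypothetical records: age in years and a diploid genotype at one marker.
subjects = [
    {"age": 25, "genotype": ("A", "G")}, {"age": 31, "genotype": ("A", "A")},
    {"age": 38, "genotype": ("A", "G")}, {"age": 29, "genotype": ("G", "G")},
    {"age": 62, "genotype": ("G", "G")}, {"age": 71, "genotype": ("A", "G")},
    {"age": 66, "genotype": ("G", "G")}, {"age": 80, "genotype": ("G", "G")},
]

strata = stratify(subjects, key=lambda s: "under_40" if s["age"] < 40 else "40_and_over")
freqs = {name: allele_frequency(g, "A") for name, g in strata.items()}
young = allele_counts(strata["under_40"], "A")
old = allele_counts(strata["40_and_over"], "A")
statistic = chi_square_2x2(young[0], young[1], old[0], old[1])
```

The statistic would be compared with the chi-square critical value for one degree of freedom (3.841 at the 0.05 level); with only eight hypothetical subjects the apparent drop in frequency with age does not reach that threshold, which illustrates why large numbers of subjects are preferred.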

Exemplary analyses have been performed and these are shown in the 
figures, and discussed herein. 

Sample pooling 

It has been found that using the databases provided herein, or any other
database of such information, substantially the same frequencies that were
obtained by examining each sample separately can be obtained by pooling
samples, such as in batches of 10, 20, 50, 100, 200, 500, 1000 or any other
number. A precise number may be determined empirically if necessary, and can
be as low as 3.

In one embodiment, the frequency of genotypic and other markers can be
obtained by pooling samples. To do this, a target population and a genetic
variation to be assessed are selected, a plurality of samples of biopolymers are
obtained from members of the population, and the biopolymer from which the
marker or genotype can be inferred is determined or detected. A comparison of
samples tested in pools and individually and the sorted results therefrom are
shown in Figure 9, which shows frequency of the factor VII Allele 353Q. Figure
10 depicts the frequency of the CETP Allele CETP in pooled versus individual
samples. Figure 15 shows ethnic diversity among various ethnic groups in the
database using pooled DNA samples to obtain the data. Figures 12-14 show
mass spectra for these samples.

Pooling of test samples has application not only to the healthy databases
provided herein, but also to use in gathering data for entry into any database of
subjects and genotypic information, including typical databases derived from
diseased populations. What is demonstrated herein is the finding that the
results achieved are statistically the same as the results that would be achieved
if each sample is analyzed separately. Analysis of pooled samples by a method,
such as the mass spectrometric methods provided herein, permits resolution of
such data and quantitation of the results.
For factor VII, the R353Q amino acid polymorphism was assessed. In Figure 9,
the "individual" data represent allelic frequency observed in 92 individual
reactions. The pooled data represent the allelic frequency of the same 92
individuals pooled into a single probe reaction. The concentration of DNA in the
samples of individual donors is 250 nanograms. The total concentration of DNA
in the pooled samples is also 250 nanograms, where the concentration of any
individual DNA is 2.7 nanograms.
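
The arithmetic behind these figures, and the reason a pool is expected to reproduce the individually determined frequency, can be sketched as follows. The sketch assumes that every individual contributes an equal amount of DNA to the pool; the genotypes shown are hypothetical.

```python
def per_individual_amount(total_ng, n_individuals):
    """DNA contributed by each individual to an equal-parts pool."""
    return total_ng / n_individuals


def frequency_from_individuals(genotypes, allele):
    """Allele frequency obtained by counting alleles sample by sample."""
    alleles = [a for genotype in genotypes for a in genotype]
    return alleles.count(allele) / len(alleles)


def frequency_from_pool(genotypes, allele):
    """Expected allele fraction in a pool with equal input per individual."""
    fractions = [genotype.count(allele) / len(genotype) for genotype in genotypes]
    return sum(fractions) / len(fractions)


share = per_individual_amount(250, 92)  # about 2.7 ng, as stated above

# Hypothetical diploid genotypes at a biallelic R/Q marker.
genotypes = [("R", "R"), ("R", "Q"), ("Q", "Q"), ("R", "R")]
individual_q = frequency_from_individuals(genotypes, "Q")
pooled_q = frequency_from_pool(genotypes, "Q")
```

With equal contributions the two calculations agree exactly, so any difference seen in practice reflects measurement error rather than the pooling itself.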

It also was shown that it is possible to reduce the DNA concentration of
individuals in a pooled sample from 2.7 nanograms to 0.27 nanograms without
any change in the quality of the spectrum or the ability to quantitate the amount
of sample detected. Hence low concentrations of sample may be used in the
pooling methods.
Use of the databases and markers identified thereby

The successful use of genomics requires a scientific hypothesis (i.e.,
common genetic variation, such as a SNP), a study design (i.e., complex
disorders), samples and technology, such as the chip-based mass spectrometry
analyses (see, e.g., U.S. Patent No. 5,605,798, U.S. Patent No. 5,777,324,
U.S. Patent No. 6,043,031, allowed copending U.S. application Serial No.
08/744,481, U.S. application Serial No. 08/990,851, International PCT
application No. WO 98/20019, copending U.S. application Serial No.
09/285,481, which describes an automated process line for analyses; see, also,
U.S. application Serial Nos. 08/617,256, 09/287,681, 09/287,682, 09/287,141
and 09/287,679, allowed copending U.S. application Serial No. 08/744,481,
International PCT application No. PCT/US97/20444, published as International
PCT application No. WO 98/20019, and based upon U.S. application Serial Nos.
08/744,481, 08/744,590, 08/746,036, 08/746,055, 08/786,988, 08/787,639,
08/933,792, 08/746,055, 09/266,409, 08/786,988 and 08/787,639; see, also
U.S. application Serial No. 09/074,936). All of these aspects can be used in
conjunction with the databases provided herein and samples in the collection.

The databases and markers identified thereby can be used, for example,
for identification of previously unidentified or unknown genetic markers and to
identify new uses for known markers. As markers are identified, these may be
entered into the database to use as sorting parameters from which additional
correlations may be determined.

Previously unidentified or unknown genetic markers

The samples in the healthy databases can be used to identify new
polymorphisms and genetic markers, using any mapping, sequencing,
amplification and other methodologies, and in looking for polymorphisms among
the population in the database. The thus-identified polymorphism can then be
entered into the database for each sample, and the database sorted (stratified)
using that polymorphism as a sorting parameter to identify any patterns and
correlations that emerge, such as age correlated changes in the frequency of the
identified marker. If a correlation is identified, the locus of the marker can be
mapped and its function or effect assessed or deduced.

Thus, the databases here provide means for:

identification of significantly different allelic frequencies of genetic factors
by comparing the occurrence or disappearance of the markers with increasing
age in the population and then associating the markers with a disease or a
biochemical pathway;

identification of significantly different allelic frequencies of disease
causing genetic factors by comparing the male with the female population or
comparing other selected stratified populations and associating the markers with
a disease or a biochemical pathway;

identification of significantly different allelic frequencies of disease
causing genetic factors by comparing different ethnic groups and associating the
markers with a disease or a biochemical pathway that is known to occur in high
frequency in the ethnic group;

profiling potentially functional variants of genes through the general
panmixed population stratified according to age, sex, and ethnic origin and
thereby demonstrating the contribution of the variant genes to the physical
condition of the investigated population;

identification of functionally relevant gene variants by gene disequilibrium
analysis performed within the general panmixed population stratified according
to age, sex, and ethnic origin and thereby demonstrating their contribution to the
physical condition of the investigated population;

identification of potentially functional variants of chromosomes or parts of
chromosomes by linkage disequilibrium analysis performed within the general
panmixed population stratified according to age, sex, and ethnic origin and
thereby demonstrating their contribution to the physical condition of the investigated
population.

Uses of the identified markers and known markers 

The databases may also be used in conjunction with known markers and
sorted to identify any correlations. For example, the databases can be used
for:

determination and evaluation of the penetrance of medically relevant
polymorphic markers;

determination and evaluation of the diagnostic specificity of medically
relevant genetic factors;

determination and evaluation of the positive predictive value of medically
relevant genetic factors;

determination and evaluation of the onset of complex diseases, such as, but
not limited to, diabetes, hypertension, autoimmune diseases, arteriosclerosis,
cancer and other diseases within the general population with respect to their
causative genetic factors;

delineation of the appropriate strategies for preventive disease treatment;

delineation of appropriate timelines for primary disease intervention;

validation of medically relevant genetic factors identified in isolated
populations regarding their general applicability;

validation of disease pathways including all potential target structures
identified in isolated populations regarding their general applicability; and

validation of appropriate drug targets identified in isolated populations
regarding their general applicability.
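
By way of illustration only, two of the measures listed above can be computed
from a two-by-two table of marker status against disease status. The sketch
below uses the standard textbook definitions of specificity and positive
predictive value; the function name, argument names and counts are
hypothetical and are not taken from the database described herein.

```python
def diagnostic_measures(tp, fp, tn, fn):
    """Measures for a genetic marker treated as a diagnostic test.

    tp: carriers who are affected        fp: carriers who are unaffected
    tn: non-carriers who are unaffected  fn: non-carriers who are affected
    (fn is accepted for completeness; these two measures do not use it.)
    """
    return {
        # Fraction of unaffected subjects who do not carry the marker.
        "specificity": tn / (tn + fp),
        # Fraction of marker carriers who are affected.
        "positive_predictive_value": tp / (tp + fp),
    }


# Made-up counts, for illustration only.
measures = diagnostic_measures(tp=30, fp=20, tn=930, fn=20)
```

Under this framing the positive predictive value is the fraction of carriers
who are affected, which is one common way of approaching an estimate of
penetrance.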

The diseases and disorders to which polymorphisms may be linked include
those linked to inborn errors of metabolism, acquired metabolic disorders,
intermediary metabolism, oncogenesis pathways, blood clotting pathways, DNA
synthetic and repair pathways, and DNA repair/replication/transcription
factors and activities, e.g., genes related to oncogenesis and aging, and
genes involved in blood clotting and the related biochemical pathways that are
related to thrombosis, embolism, stroke, myocardial infarction, angiogenesis
and oncogenesis.

For example, a number of diseases are caused by or involve deficient or
defective enzymes in intermediary metabolism (see, e.g., Tables 1 and 2,
below) that result, upon ingestion of the enzyme substrates, in accumulation
of harmful metabolites that damage organs and tissues, particularly an
infant's developing brain and other organs, resulting in mental retardation
and other developmental disorders. Identification of markers and genes for
such disorders is of great interest.
Model systems 

Several gene systems, p21, p53 and Lipoprotein Lipase polymorphism
(N291S), were selected. The p53 gene is a tumor suppressor gene that is
mutated in diverse tumor types. One common allelic variant occurs at codon 72.
A polymorphism that has been identified in the p53 gene, i.e., the R72P
allele, results in an amino acid exchange, arginine to proline, at codon 72 of
the gene.

Using diseased populations, it has been shown that there are ethnic
differences in the distribution of these alleles among African-Americans and
Caucasians in the U.S. The results here support this finding and also
demonstrate that the results obtained with a healthy database are meaningful
(see Figure 7B).

The 291S allele leads to reduced levels of high density lipoprotein
cholesterol (HDL-C) that is associated with an increased risk of males for
arteriosclerosis and in particular myocardial infarction (see, Reymer et al.
(1995) Nature Genetics 10:28-34).

Both genetic polymorphisms were profiled within a part of the Caucasian
population-based sample bank. For the polymorphism located in the lipoprotein
lipase gene a total of 1025 unselected individuals (436 males and 589 females)
were tested. Genomic DNA was isolated from blood samples obtained from the
individuals.

As shown in the Examples and figures, an exemplary database containing
about 5000 subjects, answers to the questionnaire (see Figure 3), and
genotypic information has been stratified. A particular known allele has been
selected, and the samples tested for the marker using mass spectrometric
analyses, particularly PROBE (see the EXAMPLES), to identify polymorphisms in
each sample. The population in the database has been sorted according to
various parameters and correlations have been observed. For example, FIGURES
2A-C show sorting of the data by age and sex for the Lipoprotein Lipase gene
in the Caucasian population in the database. The results show a decrease in
the frequency of the allele with age in males but no such decrease in females.
Other alleles that have been tested against the database include alleles of
p53, p21 and factor VII. Results when sorted by age are shown in the figures.

These examples demonstrate an effect of altered frequency of
disease-causing genetic factors within the general population. The scientific
interpretation of those results allows prediction of the medical relevance of
polymorphic genetic alterations. In addition, conclusions can be drawn with
regard to their penetrance, diagnostic specificity, positive predictive value,
onset of disease, most appropriate onset of preventive strategies, and the
general applicability of genetic alterations identified in isolated
populations to panmixed populations.

Therefore, an age- and sex-stratified population-based sample bank that is
ethnically homogeneous is a suitable tool for rapid identification and
validation of genetic factors regarding their potential medical utility.

Exemplary computer system for creating, storing and processing the databases

Systems

Systems, including computers, containing the databases are provided
herein. The computers and databases can be used in conjunction, for example,
with the APL system (see, copending U.S. application Serial No. 09/285,481),
which is an automated system for analyzing biopolymers, particularly nucleic
acids. Results from the APL system can be entered into the database.

Any suitable computer system may be used. The computer system may be
integrated into systems for sample analysis, such as the automated process
line described herein (see, e.g., copending U.S. application Serial No.
09/285,481).

Figure 17 is a block diagram of a computer constructed to provide and
process the databases described herein. The processing that maintains the
database and performs the methods and procedures may be performed on multiple
computers all having a similar construction, or may be performed by a single,
integrated computer. For example, the computer through which data is added to
the database may be separate from the computer through which the database is
sorted, or may be integrated with it. In either arrangement, the computers
performing the processing may have a construction as illustrated in Figure 17.

Figure 17 is a block diagram of an exemplary computer 1700 that
maintains the database described above and performs the methods and
procedures. Each computer 1700 operates under control of a central processor
unit (CPU) 1702, such as a "Pentium" microprocessor and associated integrated
circuit chips, available from Intel Corporation of Santa Clara, California,
USA. A computer user can input commands and data from a keyboard and display
mouse 1704 and can view inputs and computer output at a display 1706. The
display is typically a video monitor or flat panel display device. The
computer 1700 also includes a direct access storage device (DASD) 1707, such
as a fixed hard disk drive. The memory 1708 typically comprises volatile
semiconductor random access memory (RAM). Each computer preferably includes a
program product reader 1710 that accepts a program product storage device
1712, from which the program product reader can read data (and to which it can
optionally write data). The program product reader can comprise, for example,
a disk drive, and the program product storage device can comprise removable
storage media such as a magnetic floppy disk, an optical CD-ROM disc, a CD-R
disc, a CD-RW disc, or a DVD data disc. If desired, the computers can be
connected so they can communicate with each other, and with other connected
computers, over a network 1713. Each computer 1700 can communicate with the
other connected computers over the network 1713 through a network interface
1714 that enables communication over a connection 1716 between the network and
the computer.

The computer 1700 operates under control of programming steps that are
temporarily stored in the memory 1708 in accordance with conventional computer
construction. When the programming steps are executed by the CPU 1702, the
pertinent system components perform their respective functions. Thus, the
programming steps implement the functionality of the system as described
above. The programming steps can be received from the DASD 1707, through the
program product reader 1710, or through the network connection 1716. The
storage drive 1710 can receive a program product, read programming steps
recorded thereon and transfer the programming steps into the memory 1708 for
execution by the CPU 1702. As noted above, the program product storage device
1712 can comprise any one of multiple removable media having recorded
computer-readable instructions, including magnetic floppy disks and CD-ROM
storage discs. Other suitable program product storage devices can include
magnetic tape and semiconductor memory chips. In this way, the processing
steps necessary for operation can be embodied on a program product.

Alternatively, the program steps can be received into the operating
memory 1708 over the network 1713. In the network method, the computer
receives data including program steps into the memory 1708 through the network
interface 1714 after network communication has been established over the
network connection 1716 by well-known methods that will be understood by those
skilled in the art without further explanation. The program steps are then
executed by the CPU 1702 to implement the processing of the database system.

It should be understood that all of the computers of the system preferably
have a construction similar to that shown in Figure 17, so that details
described with respect to the Figure 17 computer 1700 will be understood to
apply to all computers 1700 of the system. This is indicated by multiple
computers 1700 shown connected to the network 1713. Any one of the computers
1700 can have an alternative construction, so long as it can communicate with
the other computers and support the functionality described herein.

Figure 18 is a flow diagram that illustrates the processing steps
performed using the computer illustrated in Figure 17, to maintain and provide
access to the databases, such as for identifying polymorphic genetic markers.
In particular, the information contained in the database is stored in
computers having a construction similar to that illustrated in Figure 17. The
first step for maintaining the database, as indicated in Figure 18, is to
identify healthy members of a population. As noted above, the population
members are subjects that are selected only on the basis of being healthy, and
where the subjects are mammals, such as humans, they are preferably selected
based upon apparent health and the absence of detectable infections. The step
of identifying is represented by the flow diagram box numbered 1802.

The next step, represented by the flow diagram box numbered 1804, is to
obtain identifying and historical information and data relating to the
identified members of the population. The information and data comprise
parameters for each of the population members, such as member age, ethnicity,
sex, medical history, and ultimately genotypic information. Initially, the
parameter information is obtained from a questionnaire answered by each
member, from whom a body tissue or body fluid sample also is obtained. The
step of entering and storing these parameters into the database of the
computer is represented by the flow diagram box numbered 1806. As additional
information about each population member and corresponding sample is obtained,
this information can be inputted into the database and can serve as a sorting
parameter.
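
As a minimal sketch of the entering and storing step, assuming a relational
layout that is not specified herein, the questionnaire parameters and later
genotypic information might be held in two linked tables. The table names,
column names, sample identifier and genotype notation below are hypothetical,
and the Python sqlite3 module merely stands in for whatever database software
is used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One row per population member, filled in from the questionnaire.
conn.execute(
    """
    CREATE TABLE donor (
        sample_id TEXT PRIMARY KEY,  -- identifier carried on the sample label
        age       INTEGER,
        sex       TEXT,
        ethnicity TEXT
    )
    """
)
# Genotypic information is added later, one row per sample and marker.
conn.execute(
    """
    CREATE TABLE genotype (
        sample_id TEXT REFERENCES donor(sample_id),
        marker    TEXT,
        genotype  TEXT,
        PRIMARY KEY (sample_id, marker)
    )
    """
)


def add_donor(sample_id, age, sex, ethnicity):
    conn.execute(
        "INSERT INTO donor VALUES (?, ?, ?, ?)",
        (sample_id, age, sex, ethnicity),
    )


def add_genotype(sample_id, marker, genotype):
    conn.execute(
        "INSERT INTO genotype VALUES (?, ?, ?)",
        (sample_id, marker, genotype),
    )


# Made-up record, for illustration only.
add_donor("S0001", 34, "F", "Caucasian")
add_genotype("S0001", "LPL N291S", "N/S")

row = conn.execute(
    "SELECT d.age, d.sex, g.genotype "
    "FROM donor d JOIN genotype g USING (sample_id) "
    "WHERE g.marker = ?",
    ("LPL N291S",),
).fetchone()
```

Keeping genotypes in their own table lets each newly typed marker be added
without altering the donor records, so that it can immediately serve as a
further sorting parameter.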

In the next step, represented by the flow diagram box numbered 1808, the
parameters of the members are associated with an indexer. This step may be
executed as part of the database storage operation, such as when a new data
record is stored according to the relational database structure and is
automatically linked with other records according to that structure. The step
1808 also may be executed as part of a conventional data sorting or retrieval
process, in which the database entries are searched according to an input
search or indexing key value to determine attributes of the data. For example,
such search and sort techniques may be used to follow the occurrence of known
genetic markers and then determine if there is a correlation with diseases for
which they have been implicated. Examples of this use are for assessing the
frequencies of the p53 and Lipoprotein Lipase polymorphisms.

Such searching of the database also may be valuable for identifying one
or more genetic markers whose frequency changes within the population as a
function of age, ethnic group, sex, or some other criterion. This can allow
the identification of previously unknown polymorphisms and, ultimately,
identification of a gene or pathway involved in the onset and progression of
disease.

In addition, the database can be used for taking an identified
polymorphism and ascertaining whether it changes in frequency when the data is
sorted according to a selected parameter.
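
The sorting described above can be sketched, purely as an illustration, as a
grouped query that reports the frequency of a variant allele in each stratum.
The table layout, the column variant_copies (0, 1 or 2 copies of the variant
allele per subject), the age cut-off of 60 years and the records are all
hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sample (age INTEGER, sex TEXT, variant_copies INTEGER)"
)
# Made-up records, for illustration only.
conn.executemany(
    "INSERT INTO sample VALUES (?, ?, ?)",
    [
        (22, "M", 1), (35, "M", 0), (64, "M", 0), (71, "M", 0),
        (29, "F", 0), (41, "F", 1), (66, "F", 1), (75, "F", 0),
    ],
)

# Each diploid subject contributes two alleles, hence the factor 2.0.
rows = conn.execute(
    """
    SELECT sex,
           CASE WHEN age < 60 THEN 'under 60' ELSE '60 and over' END
               AS age_group,
           COUNT(*) AS subjects,
           SUM(variant_copies) / (2.0 * COUNT(*)) AS allele_frequency
    FROM sample
    GROUP BY sex, age_group
    ORDER BY sex, age_group
    """
).fetchall()

for sex, age_group, subjects, frequency in rows:
    print(sex, age_group, subjects, frequency)
```

Changing the grouping expression is all that is needed to sort by a different
parameter, such as ethnic group or a questionnaire answer.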

In this way, the databases and methods provided herein permit, among
other things, identification of components, particularly key components, of a
disease process by understanding its genetic underpinnings, and also an
understanding of processes, such as individual drug responses. The databases
and methods provided herein also can be used in methods involving elucidation
of pathological pathways, in developing new diagnostic assays, identifying new
potential drug targets, and in identifying new drug candidates.
Morbidity and/or early mortality associated polymorphisms 

A database containing information provided by a population of healthy
blood donors who were not selected for any particular disease can be used to
identify polymorphisms, and the alleles in which they are present, whose
frequency decreases with age. These may represent morbidity susceptibility
markers and genes.

Polymorphisms of the genome can lead to altered gene function, protein
function or genome instability. Identifying those polymorphisms that have
clinical relevance or utility is the goal of a worldwide scientific effort. It
can be expected that the discovery of such polymorphisms will have a
fundamental impact on the identification and development of novel drug
compounds to cure diseases. However, the strategy to identify valuable
polymorphisms is cumbersome and dependent upon the availability of many large
patient and control cohorts to show disease association. In particular, genes
that cause a general risk of the population to suffer from any disease
(morbidity susceptibility genes) will escape these case/control studies
entirely.

Described here is a screening strategy to identify morbidity
susceptibility genes underlying a variety of different diseases. A morbidity
susceptibility gene is defined as a gene that is expressed in many different
cell types or tissues (a housekeeping gene) and whose altered function can
facilitate the expression of a clinical phenotype caused by disease-specific
susceptibility genes that are involved in a pathway specific for the disorder.
In other words, these morbidity susceptibility genes predispose people to
develop a distinct disease according to their genetic make-up for this
disease.

Candidates for morbidity susceptibility genes can be found at the bottom
level of pathways involving transcription, translation, heat-shock proteins,
protein trafficking, DNA repair, assembly systems for subcellular structures
(e.g., mitochondria, peroxisomes and other cellular microbodies), receptor
signaling cascades, immunology, etc. Those pathways control the quality of
life at the cellular level as well as for the entire organism.
Mutations/polymorphisms located in genes encoding proteins for those pathways
can reduce the fitness of cells and make the organism more susceptible to
express the clinical phenotype caused by the action of a disease-specific
susceptibility gene. Therefore, these morbidity susceptibility genes can be
potentially involved in a whole variety of different complex diseases, if not
in all. Disease-specific susceptibility genes are involved in pathways that
can be considered as disease-specific pathways, such as glucose, lipid and
hormone metabolism, etc.

The exemplified methods permit, among other things, identification of
genes and/or gene products involved in a human's general susceptibility to
morbidity and/or mortality; use of these genes and/or gene products in studies
to elucidate the genetic underpinnings of human diseases; use of these genes
and/or gene products in combinatorial statistical analyses without or together
with disease-specific susceptibility genes; use of these genes and/or gene
products to predict penetrance of disease susceptibility genes; use of these
genes and/or gene products in predisposition and/or acute medical diagnostics;
and use of these genes and/or gene products to develop drugs to cure diseases
and/or to extend the life span of humans.
SCREENING PROCESS 

The healthy population stratified by age, gender, ethnicity, etc. is a
very efficient and universal screening tool for morbidity associated genes.
Changes of allelic frequencies in the young compared to the old population are
expected to indicate putative morbidity susceptibility genes. Individual
samples of this healthy population base can be pooled to further increase the
throughput. In a proof of principle experiment, pools of young and old
Caucasian females and males were used to screen more than 400 randomly chosen
single nucleotide polymorphisms located in many different genes. Candidate
polymorphisms were identified if the allelic difference between young and old
was greater than 8% for both or only one of the genders. The initial results
were assayed again in at least one independent subsequent experiment. Repeated
experiments are necessary to recognize unstable biochemical reactions, which
occur with a frequency of about 2-3% and can mimic age-related allelic
frequency differences. Average frequency differences and standard deviations
are calculated after the initial results have been successfully reproduced.
The final allelic frequency is then compared to that of a reference population
in the form of a Caucasian CEPH sample pool. The result should show similar
allelic frequencies in the young Caucasian population. Subsequently, the exact
allele frequencies of candidates, including genotype information, were
obtained by analyzing all individual samples. This procedure is
straightforward with regard to time and cost. It enables the screening of an
enormous number of SNPs. So far, several markers with a highly significant
association to age have been identified and are described below.
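
The candidate selection and the summary of repeated experiments can be
sketched as follows. This is a simplified illustration that assumes the 8%
criterion refers to an absolute difference of 0.08 in pooled allele frequency;
the function names and all numbers are hypothetical.

```python
from statistics import mean, stdev

THRESHOLD = 0.08  # the 8% allelic difference named in the text


def is_candidate(young_freq, old_freq, threshold=THRESHOLD):
    """True if young and old pools differ by more than the threshold
    in at least one gender.

    young_freq and old_freq map a gender label to a pooled allele
    frequency, e.g. {"F": 0.21, "M": 0.30}.
    """
    return any(
        abs(young_freq[sex] - old_freq[sex]) > threshold
        for sex in young_freq
    )


def summarize_replicates(differences):
    """Mean and sample standard deviation of the young-minus-old
    frequency differences seen in repeated experiments."""
    return mean(differences), stdev(differences)


# Made-up values: the male pools differ by 0.11, the female pools by 0.01.
candidate = is_candidate({"F": 0.21, "M": 0.30}, {"F": 0.20, "M": 0.19})
average, spread = summarize_replicates([0.11, 0.09, 0.10])
```

A marker flagged in this way would still be confirmed by typing the individual
samples, as described above.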

In general, at least 5 individuals in a stratified population need to be
screened to produce statistically significant results. The frequency of the
allele is determined for an age-stratified population. Chi-square analysis is
then performed on the allelic frequencies to determine if the difference
between age groups is statistically significant. A p value of less than 0.1 is
considered to represent a statistically significant difference. More
preferably, the p value should be less than 0.05.
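
One way the chi-square analysis might be carried out is sketched below for the
simplest case, a two-by-two table of allele counts in two age groups. It
assumes a Pearson chi-square statistic without continuity correction, which is
a choice made here for illustration and is not specified in the text; the
allele counts are made up.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test for a 2x2 table of allele counts.

                 variant allele   other allele
        young          a               b
        old            c               d

    Returns (statistic, p_value) for one degree of freedom.
    """
    n = a + b + c + d
    statistic = (
        n * (a * d - b * c) ** 2
        / ((a + b) * (c + d) * (a + c) * (b + d))
    )
    # With one degree of freedom the upper tail probability of the
    # chi-square distribution equals erfc(sqrt(statistic / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Made-up counts: 60 variant alleles among 2000 alleles in the young
# group and 10 among 1000 alleles in the old group.
statistic, p_value = chi_square_2x2(60, 1940, 10, 990)
significant = p_value < 0.05  # the more preferred threshold
```

With more than two age groups the same statistic generalizes to a larger
contingency table with correspondingly more degrees of freedom.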
Clinical Trials 

The identification of markers whose frequency in a population decreases
with age also allows for better designed and balanced clinical trials.
Currently, if a clinical trial utilizes a marker as a significant endpoint in
a study and the marker disappears with age, then the results of the study may
be inaccurate. By using methods provided herein, it can be ascertained whether
a marker decreases in frequency with age. This information can be considered
and controlled for when designing the study. For example, an age-independent
marker could be substituted in its place.



WO 01/027857 



PCT/US00/28413



The following examples are included for illustrative purposes only and are 
not intended to limit the scope of the invention. 

EXAMPLE 1 

This example describes the use of a database containing information
provided by a population of healthy blood donors who were not selected for any
particular disease to determine the distribution of allelic frequencies of
known genetic markers with age and by sex in a Caucasian subpopulation of the
database. The results described in this example demonstrate that a
disease-related genetic marker or polymorphism can be identified by sorting a
healthy database by a parameter or parameters, such as age, sex and ethnicity.
Generating a database 

Blood was obtained by venous puncture from human subjects who met blood
bank criteria for donating blood. The blood samples were preserved with EDTA
at pH 8.0 and labeled. Each donor provided information such as age, sex,
ethnicity, medical history and family medical history. Each sample was labeled
with a barcode representing identifying information. A database was generated
by entering, for each donor, the subject identifier and information
corresponding to that subject into the memory of a computer storage medium
using commercially available software, e.g., Microsoft Access.

Model genetic markers

The frequencies of polymorphisms known to be associated at some level
with disease were determined in a subpopulation of the subjects represented in
the database. These known polymorphisms occur in the p21, p53 and Lipoprotein
Lipase genes. Specifically, the N291S polymorphism of the Lipoprotein Lipase
gene, which results in a substitution of a serine for an asparagine at amino
acid codon 291, leads to reduced levels of high density lipoprotein
cholesterol (HDL-C) that is associated with an increased risk of males for
arteriosclerosis and in particular myocardial infarction (see, Reymer et al.
(1995) Nature Genetics 10:28-34).

The p53 gene encodes a cell cycle control protein that assesses DNA
damage and acts as a transcription factor regulating genes that control cell
growth, DNA repair and apoptosis (programmed cell death). Mutations in the p53
gene have been found in a wide variety of different cancers, including
different types of leukemia, with varying frequency. The loss of normal p53
function results in genomic instability and uncontrolled cell growth. A
polymorphism that has been identified in the p53 gene, i.e., the R72P allele,
results in the substitution of a proline for an arginine at amino acid codon
72 of the gene.

The p21 gene encodes a cyclin-dependent kinase inhibitor associated with
G1 phase arrest of normal cells. Expression of the p21 gene triggers
apoptosis. Polymorphisms of the p21 gene have been associated with Wilms'
tumor, a pediatric kidney cancer. One polymorphism of the p21 gene, the S31R
polymorphism, results in a substitution of an arginine for a serine at amino
acid codon 31.

Database analysis 

Sorting of subjects according to specific parameters 

The genetic polymorphisms were profiled within segments of the Caucasian
subpopulation of the sample bank. For p53 profiling, the genomic DNA isolated
from blood from a total of 1277 Caucasian subjects age 18-59 years and 457
Caucasian subjects age 60-79 years was analyzed. For p21 profiling, the
genomic DNA isolated from blood from a total of 910 Caucasian subjects age
18-49 years and 824 Caucasian subjects age 50-79 years was analyzed. For
lipoprotein lipase gene profiling, the genomic DNA from a total of 1464
Caucasian females and 1470 Caucasian males under 60 years of age and a total
of 478 Caucasian females and 560 Caucasian males over 60 years of age was
analyzed.

Isolation and analysis of genomic DNA

Genomic DNA was isolated from blood samples obtained from the
individuals. Ten milliliters of whole blood from each individual was
centrifuged at 2000 x g. One milliliter of the buffy coat was added to 9 ml of
155 mM NH4Cl, 10 mM KHCO3, and 0.1 mM Na2EDTA, incubated 10 min at room
temperature and centrifuged for 10 min at 2000 x g. The supernatant was
removed, and the white cell pellet was washed in 155 mM NH4Cl, 10 mM KHCO3 and
0.1 mM Na2EDTA and resuspended in 4.5 ml of 50 mM Tris, 5 mM EDTA and 1% SDS.
Proteins were precipitated from the cell lysate by 6 mM ammonium acetate, pH
7.3, and then separated from the nucleic acids by centrifugation at 3000 x g.
The nucleic acid was recovered from the supernatant by the addition of an
equal volume of 100% isopropanol and centrifugation at 2000 x g. The dried
nucleic acid pellet was hydrated in 10 mM Tris, pH 7.6, and 1 mM Na2EDTA and
stored at 4 °C.

Assays of the genomic DNA to determine the presence or absence of the
known genetic markers were developed using the BiomassPROBE™ detection method
(primer oligo base extension) reaction. This method uses a single detection
primer followed by an oligonucleotide extension step to give products, which
can be readily resolved by mass spectrometry, and, in particular, MALDI-TOF
mass spectrometry. The products differ in length depending on the presence or
absence of a polymorphism. In this method, a detection primer anneals adjacent
to the site of a variable nucleotide or sequence of nucleotides and the primer
is extended using a DNA polymerase in the presence of one or more dideoxyNTPs
and, optionally, one or more deoxyNTPs. The resulting products are resolved by
MALDI-TOF mass spectrometry. The mass of the products as measured by MALDI-TOF
mass spectrometry makes possible the determination of the nucleotide(s)
present at the variable site.

First, each of the Caucasian genomic DNA samples was subjected to
nucleic acid amplification using primers corresponding to sites 5' and 3' of
the polymorphic sites of the p21 (S31R allele), p53 (R72P allele) and
Lipoprotein Lipase (N291S allele) genes. One primer in each primer pair was
biotinylated to permit immobilization of the amplification product to a solid
support. Specifically, the polymerase chain reaction primers used for
amplification of the relevant segments of the p21, p53 and lipoprotein lipase
genes are shown below: US4p21c31-2F (SEQ ID NO: 9) and US5p21-2R (SEQ ID NO:
10) for p21 gene amplification; US4-p53-ex4-F (also shown as p53-ex4US4 (SEQ
ID NO: 2)) and US5-p53/2-4R (also shown as US5P53/4R (SEQ ID NO: 3)) for p53
gene amplification; and US4-LPL-F2 (SEQ ID NO: 16) and US5-LPL-R2 (SEQ ID NO:
17) for lipoprotein lipase gene amplification.

Amplification of the respective DNA sequences was conducted according to
standard protocols. For example, primers may be used in a concentration of 8
pmol. The reaction mixture (e.g., total volume 50 µl) may contain
Taq-polymerase including 10x buffer and dNTPs. Cycling conditions for
polymerase chain reaction amplification may typically be initially 5 min at
95 °C, followed by 1 min at 94 °C, 45 sec at 53 °C, and 30 sec at 72 °C for 40
cycles, with a final extension time of 5 min at 72 °C. Amplification products
may be purified by using Qiagen's PCR purification kit (No. 28106) according
to manufacturer's instructions. The elution of the purified products from the
column can be done in 50 µl TE-buffer (10 mM Tris, 1 mM EDTA, pH 7.5).
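
For planning purposes, the exemplary cycling conditions can be written down as
a small program that adds up the programmed hold times. The sketch below is
illustrative only: the step labels are the conventional names for the three
temperature holds and do not appear in the text, and time spent ramping
between temperatures is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    label: str          # conventional name, assumed for illustration
    temperature_c: int  # hold temperature in degrees Celsius
    seconds: int        # hold time


INITIAL_DENATURATION = Step("initial denaturation", 95, 5 * 60)
CYCLE = (
    Step("denaturation", 94, 60),
    Step("annealing", 53, 45),
    Step("extension", 72, 30),
)
CYCLES = 40
FINAL_EXTENSION = Step("final extension", 72, 5 * 60)


def total_hold_seconds():
    """Sum of all programmed hold times, excluding ramp times."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return (
        INITIAL_DENATURATION.seconds
        + CYCLES * per_cycle
        + FINAL_EXTENSION.seconds
    )


total_minutes = total_hold_seconds() / 60
```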

The purified amplification products were immobilized via a biotin-avidin
linkage to streptavidin-coated beads and the double-stranded DNA was
denatured. A detection primer was then annealed to the immobilized DNA using
conditions such as, for example, the following: 50 µl annealing buffer (20 mM
Tris, 10 mM KCl, 10 mM (NH4)2SO4, 2 mM MgSO4, 1% Triton X-100, pH 8) at 50 °C
for 10 min, followed by washing of the beads three times with 200 µl washing
buffer (40 mM Tris, 1 mM EDTA, 50 mM NaCl, 0.1% Tween 20, pH 8.8) and once in
200 µl TE buffer.

The PROBE extension reaction was performed, for example, by using some
components of the DNA sequencing kit from USB (No. 70770) and dNTPs or ddNTPs
from Pharmacia. An exemplary protocol could include a total reaction volume of
45 µl, containing 21 µl water, 6 µl Sequenase-buffer, 3 µl 10 mM DTT solution,
4.5 µl of 0.5 mM of three dNTPs, 4.5 µl of 2 mM of the missing one as ddNTP,
5.5 µl glycerol enzyme dilution buffer, 0.25 µl Sequenase 2.0, and 0.25 µl
pyrophosphatase. The reaction can then be pipetted on ice and incubated for 15
min at room temperature and for 5 min at 37 °C. The beads may be washed three
times with 200 µl washing buffer and once with 60 µl of a 70 mM NH4-citrate
solution.
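
As a simple consistency check, the component volumes of the exemplary
extension reaction can be summed and compared with the stated total of 45 µl.
The dictionary keys below are shorthand labels chosen for illustration, and
the pyrophosphatase volume is assumed to be in microliters like the other
components.

```python
# Component volumes in microliters, as listed in the exemplary protocol.
REACTION_COMPONENTS_UL = {
    "water": 21.0,
    "Sequenase buffer": 6.0,
    "10 mM DTT solution": 3.0,
    "0.5 mM dNTPs (three)": 4.5,
    "2 mM ddNTP (the missing one)": 4.5,
    "glycerol enzyme dilution buffer": 5.5,
    "Sequenase 2.0": 0.25,
    "pyrophosphatase": 0.25,  # unit assumed to be microliters
}

total_ul = sum(REACTION_COMPONENTS_UL.values())
matches_stated_total = total_ul == 45.0
```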

The DNA was denatured to release the extended primers from the
immobilized template. Each of the resulting extension products was separately
analyzed by MALDI-TOF mass spectrometry using 3-hydroxypicolinic acid (3-HPA)
as matrix and a UV laser.

Specifically, the primers used in the PROBE reactions are as shown below:
P21/31-3 (SEQ ID NO: 12) for PROBE analysis of the p21 polymorphic site;
P53/72 (SEQ ID NO: 4) for PROBE analysis of the p53 polymorphic site; and
LPL-2 for PROBE analysis of the lipoprotein lipase gene polymorphic site. In
the PROBE analysis of the p21 polymorphic site, the extension reaction was
performed using dideoxy-C. The products resulting from the reaction conducted
on a "wild-type" allele template (wherein codon 31 encodes a serine) and from
the reaction conducted on a polymorphic S31R allele template (wherein codon 31
encodes an arginine) are shown below and designated as P21/31-3 Ser (wt) (SEQ
ID NO: 13) and P21/31-3 Arg (SEQ ID NO: 14), respectively. The masses for each
product as can be measured by MALDI-TOF mass spectrometry are also provided
(i.e., 4900.2 Da for the wild-type product and 5213.4 Da for the polymorphic
product).

In the PROBE analysis of the p53 polymorphic site, the extension reaction
was performed using dideoxy-C. The products resulting from the reaction
conducted on a "wild-type" allele template (wherein codon 72 encodes an
arginine) and from the reaction conducted on a polymorphic R72P allele template
(wherein codon 72 encodes a proline) are shown below and designated as
Cod72 G Arg (wt) and Cod72 C Pro, respectively. The masses for each product
as can be measured by MALDI-TOF mass spectrometry are also provided (i.e.,
5734.8 Da for the wild-type product and 5405.6 Da for the polymorphic
product).

In the PROBE analysis of the lipoprotein lipase gene polymorphic site, the
extension reaction was performed using a mixture of ddA and ddT. The
products resulting from the reaction conducted on a "wild-type" allele template
(wherein codon 291 encodes an asparagine) and from the reaction conducted on
a polymorphic N291S allele template (wherein codon 291 encodes a serine) are
shown below and designated as 291 Asn and 291 Ser, respectively. The masses
for each product as can be measured by MALDI-TOF mass spectrometry are also
provided (i.e., 6438.2 Da for the wild-type product and 6758.4 Da for the
polymorphic product).
P53-1 (R72P)
PCR Product length: 407 bp (SEQ ID NO: 1)

US4-p53-ex4-F
ctg aggacctggt cctctgactg
ctcttttcac ccatcta cag tcccccttac ca tcccaaac aat ggatgat ttaatgctgt
ccccggacga tattgaacaa tggttcactg aagacccagg tccagatgaa gctcccagaa
P53/72 72R
tgccagaggc tgc tccccg c gtggcccctg caccagcagc tcctacaccg gcggcccctg

c 72P

caccagcccc ctcctggccc ctgtcatctt ctgtcccttc ccagaaaacc taccagggca
gctacggttt ccgtctgggc ttcttgcatt ctgggacagc caagtctgtg acttgcacgg
tcagttgccc tgaggggctg gcttccatga gacttcaa

US5-p53/2-4R

Primers (SEQ ID NOs: 2-4)

p53-ex4FUS4 ccc agt cac gac gtt gta aaa cg c tga gga cct ggt cct ctg ac
US5P53/4R agc gga taa caa ttt cac aca gg t tga agt ctc atg gaa gcc
P53/72 gcc aga ggc tgc tcc cc



Masses

Allele              Product (Termination: ddC)   SEQ #   Length   Mass
P53/72              gccagaggctgctcccc            5       17       5132.4
Cod72 G Arg (wt)    gccagaggctgctccccgc          6       19       5734.8
Cod72 C Pro         gccagaggctgctccccc           7       18       5405.6

Biotinylated US5 primer is used in the PCR amplification. 
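The product masses tabulated for the PROBE assays can be reproduced from the primer mass by adding one mass increment per incorporated nucleotide. The sketch below is not part of the described protocol; it assumes commonly used average increments for the four deoxynucleotides and treats a dideoxy terminator as about 16.0 Da lighter than the corresponding deoxynucleotide. The function and variable names are illustrative.

```python
# Assumed average mass increments (Da) for adding one deoxynucleotide
# to the 3' end of a primer. These are illustrative working values.
DNTP = {"a": 313.2, "c": 289.2, "g": 329.2, "t": 304.2}
# A dideoxynucleotide lacks the 3'-hydroxyl oxygen (about 16.0 Da lighter).
DDNTP = {base: round(mass - 16.0, 1) for base, mass in DNTP.items()}


def extended_mass(primer_mass, primer, product):
    """Mass of an extension product whose last added base is a dideoxy terminator."""
    if not product.startswith(primer) or len(product) <= len(primer):
        raise ValueError("product must extend the primer by at least one base")
    added = product[len(primer):]
    mass = primer_mass
    for base in added[:-1]:
        mass += DNTP[base]        # internal bases are ordinary deoxynucleotides
    mass += DDNTP[added[-1]]      # the final base is the dideoxy terminator
    return round(mass, 1)


primer = "gccagaggctgctcccc"      # P53/72 detection primer, 5132.4 Da
arg_mass = extended_mass(5132.4, primer, "gccagaggctgctccccgc")
pro_mass = extended_mass(5132.4, primer, "gccagaggctgctccccc")
print(arg_mass, pro_mass)
```

With these assumed increments the calculation gives 5734.8 Da and 5405.6 Da, matching the masses listed for Cod72 G Arg (wt) and Cod72 C Pro.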
LPL-1 (N291S)

Amino acid exchange asparagine to serine at codon 291 of the
lipoprotein lipase gene.

PCR Product length: 251 bp (SEQ ID NO: 15)

US4-LPL-F2 (SEQ ID NO: 16)

gcgctccatt catctcttca tcgactctct gttgaatgaa gaaaatccaa gtaaggccta
caggtgcagt tccaaggaag cctttgagaa agggctctgc ttgagttgta gaaagaaccg
LPL-2 291N

ctcrcaacaat ctg ggctatg agatca ataa agtcagagcc aaaagaagca gcaaaatgta

g 291S

cctgaagact cgttctcaga tgccc
US4-LPL-R2

Primers (SEQ ID NOs: 16-18):

US4-LPL-F2 ccc agt cac gac gtt gta aaa cg g cgc tcc att cat ctc ttc
US5-LPL-R2 agc gga taa caa ttt cac aca gg g ggc atc tga gaa cga gtc
LPL-2 caa tct ggg cta tga gat ca

Masses

Allele     Product (Termination: ddA, ddT)   SEQ #   Length   Mass
LPL-2      caatctgggctatgagatca              19      20       6141
291 Asn    caatctgggctatgagatcaa             20      21       6438.2
291 Ser    caatctgggctatgagatcagt            21      22       6758.4

Biotinylated US5 primer is used in the PCR amplification.
P21-1 (S31R)

Amino acid exchange serine to arginine at codon 31 of the tumor
suppressor gene p21. Product length: 207 bp (SEQ ID NO: 8)

US4p21c31-2F

gtccc gtcagaaccc atgcggcagc
p21/31-3 31S

aaggcctgcc gccgcctctt cggcccagtg ga cagcgagc agctgag ccg cgactgtgat

a 31R

gcgctaatgg cgggctgcat ccaggaggcc cgtgagcgat ggc.acttcga ctttgtcacc
gagacaccac tggaggg
US5p21-2R

Primers (SEQ ID NOs: 9-11)

US4p21c31-2F ccc agt cac gac gtt gta aaa cg g tcc gtc aga acc cat gcg g
US5p21-2R agc gga taa caa ttt cac aca gg c tcc agt ggt gtc tcg gtg ac
P21/31-3 cag cga gca gct gag

Masses

Allele               Product (Termination: ddC)   SEQ #   Length   Mass
p21/31-3             cagcgagcagctgag              12      15       4627
P21/31-3 Ser (wt)    cagcgagcagctgagc             13      16       4900.2
P21/31-3 Arg         cagcgagcagctgagac            14      17       5213.4

Biotinylated US5 primer is used in the PCR amplification.

Each of the Caucasian subject DNA samples was individually
analyzed by MALDI-TOF mass spectrometry to determine the identity of
the nucleotide at the polymorphic sites. The genotypic results of each
assay can be entered into the database. The results were then sorted
according to age and/or sex to determine the distribution of allelic
frequencies by age and/or sex. As depicted in the Figures showing
histograms of the results, in each case, there was a differential
distribution of the allelic frequencies of the genetic markers for the p21,
p53 and lipoprotein lipase gene polymorphisms.
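The sorting of genotype results by age to obtain allele frequencies per age group can be sketched as follows. This is a minimal illustration only: the records are hypothetical and are not data from the study, and the record layout (age, sex, genotype as a pair of alleles), the function names and the age boundary are assumptions.

```python
from collections import Counter


def allele_frequency(genotypes, allele):
    """Fraction of all chromosomes carrying `allele`; each genotype is a 2-tuple."""
    counts = Counter(a for genotype in genotypes for a in genotype)
    total = sum(counts.values())
    return counts[allele] / total if total else 0.0


def stratify(records, boundary_age):
    """Split (age, sex, genotype) records into younger and older genotype lists."""
    groups = {"younger": [], "older": []}
    for age, _sex, genotype in records:
        key = "younger" if age < boundary_age else "older"
        groups[key].append(genotype)
    return groups


# Hypothetical records for illustration only.
records = [
    (25, "F", ("S", "S")), (31, "M", ("S", "R")),
    (44, "F", ("S", "S")), (47, "M", ("S", "R")),
    (55, "M", ("S", "S")), (62, "F", ("S", "S")),
    (68, "M", ("S", "R")), (75, "F", ("S", "S")),
]

groups = stratify(records, boundary_age=50)
freqs = {name: allele_frequency(g, "R") for name, g in groups.items()}
print(freqs)
```

The same grouping could be keyed on sex, or on both age and sex, by changing the key computed inside `stratify`.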

Figure 8 shows the results of the p21 genetic marker assays
and reveals a statistically significant decrease (from 13.3% to 9.2%) in the
frequency of the heterozygous genotype (S31R) in Caucasians with age
(18-49 years of age compared to 50-79 years of age). The frequencies of
the homozygous (S31 and R31) genotypes for the two age groups are
also shown, as are the overall frequencies of the S31 and R31 alleles in
the two age groups (designated as *S31 and *R31, respectively, in the
Figure).

Figures 7A-C show the results of the p53 genetic marker assays
and reveal a statistically significant decrease (from 6.7% to 3.7%) in the
frequency of the homozygous polymorphic genotype (P72) in Caucasians
with age (18-59 years of age compared to 60-79 years of age). The
frequencies of the homozygous "wild-type" genotype (R72) and the
heterozygous genotype (R72P) for the two age groups are also shown, as
are the overall frequencies of the R72 and P72 alleles in the two age
groups (designated as *R72 and *P72, respectively, in the Figure). These
results are consistent with the observation that the allele is not benign, as
p53 regulates expression of a second protein, p21, which inhibits
cyclin-dependent kinases (CDKs) needed to drive cells through the
cell cycle (a mutation in either gene can disrupt the cell cycle, leading to
increased cell division).

Figure 2C shows the results of the lipoprotein lipase gene genetic
marker assays and reveals a statistically significant decrease (from 1.97% to
0.54%) in the frequency of the polymorphic allele (S291) in Caucasian
males with age (see also Reymer et al. (1995) Nature Genetics 10:28-34).
The frequencies of this allele in Caucasian females of different age groups
are also shown.

EXAMPLE 2

This example describes the use of MALDI-TOF mass spectrometry
to analyze DNA samples of a number of subjects as individual samples
and as pooled samples of multiple subjects to assess the presence or
absence of a polymorphic allele (the 353Q allele) of the Factor VII gene
and determine the frequency of the allele in the group of subjects. The
results of this study show that essentially the same allelic frequency can
be obtained by analyzing pooled DNA samples as by analyzing each
sample separately and thereby demonstrate the quantitative nature of
MALDI-TOF mass spectrometry in the analysis of nucleic acids.
Factor VII

Factor VII is a serine protease involved in the extrinsic blood
coagulation cascade. This factor is activated by thrombin and works with
tissue factor (Factor III) in the processing of Factor X to Factor Xa. There
is evidence that supports an association between polymorphisms in the
Factor VII gene and increased Factor VII activity which can result in an
elevated risk of ischemic cardiovascular disease, including myocardial
infarction. The polymorphism investigated in this study is R353Q (i.e., a
substitution of a glutamine residue for an arginine residue at codon
353 of the Factor VII gene) (see Table 5).

Analysis of DNA samples for the presence or absence of the 353Q
allele of the Factor VII gene

Genomic DNA was isolated from separate blood samples obtained
from a large number of subjects divided into multiple groups of 92
subjects per group. Each sample of genomic DNA was analyzed using
the BiomassPROBE™ assay as described in Example 1 to determine the
presence or absence of the 353Q polymorphism of the Factor VII gene.

First, DNA from each sample was amplified in a polymerase chain
reaction using primers F7-353FUS4 (SEQ ID NO: 24) and F7-353RUS5
(SEQ ID NO: 26) as shown below and using standard conditions, for
example, as described in Example 1. One of the primers was biotinylated
to permit immobilization of the amplification product to a solid support.
The purified amplification products were immobilized via a biotin-avidin
linkage to streptavidin-coated beads and the double-stranded DNA was
denatured. A detection primer was then annealed to the immobilized
DNA using conditions such as, for example, those described in Example 1. The
detection primer is shown as F7-353-P (SEQ ID NO: 27) below. The
PROBE extension reaction was carried out using conditions, for example,
such as those described in Example 1. The reaction was performed using
ddG.

The DNA was denatured to release the extended primers from the
immobilized template. Each of the resulting extension products was
separately analyzed by MALDI-TOF mass spectrometry. A matrix such as
3-hydroxypicolinic acid (3-HPA) and a UV laser could be used in the
MALDI-TOF mass spectrometric analysis. The products resulting from the
reaction conducted on a "wild-type" allele template (wherein codon 353
encodes an arginine) and from the reaction conducted on a polymorphic
353Q allele template (wherein codon 353 encodes a glutamine) are
shown below and designated as 353 CGG and 353 CAG, respectively.
The masses for each product as can be measured by MALDI-TOF mass
spectrometry are also provided (i.e., 5646.8 Da for the wild-type product
and 5960 Da for the polymorphic product).

The MALDI-TOF mass spectrometric analyses of the PROBE
reactions of each DNA sample were first conducted separately on each
sample (250 nanograms total concentration of DNA per analysis). The
allelic frequency of the 353Q polymorphism in the group of 92 subjects
was calculated based on the number of individual subjects in which it was
detected.

Next, the samples from 92 subjects were pooled (250 nanograms
total concentration of DNA in which the concentration of any individual
DNA is 2.7 nanograms) and the pool of DNA was subjected to MALDI-TOF
mass spectrometry analysis. The area under the signal
corresponding to the mass of the 353Q polymorphism PROBE extension
product in the resulting spectrum was integrated in order to quantitate the
amount of DNA present. The ratio of this amount to total DNA was used
to determine the allelic frequency of the 353Q polymorphism in the group
of subjects. This type of individual sample vs. pooled sample analysis
was repeated for numerous different groups of 92 different samples.
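The pooled calculation can be sketched in a few lines. This is an illustration under stated assumptions, not the procedure actually used: it assumes that "total DNA" is represented by the sum of the integrated areas of the two allele-specific extension products, and the peak areas shown are hypothetical values chosen for the example. Function names are illustrative.

```python
def per_sample_ng(total_ng, n_samples):
    """Amount of each individual DNA in an equimolar pool."""
    return total_ng / n_samples


def pooled_allele_frequency(area_variant, area_wild_type):
    """Allele frequency estimated from integrated peak areas in one pooled spectrum.

    Assumption: the two extension-product peaks together represent total DNA.
    """
    total = area_variant + area_wild_type
    if total <= 0:
        raise ValueError("peak areas must sum to a positive value")
    return area_variant / total


# 250 ng spread over 92 samples, as in the pooling described above.
each_ng = per_sample_ng(250, 92)
print(round(each_ng, 1))

# Hypothetical integrated peak areas (arbitrary units), for illustration only.
freq = pooled_allele_frequency(area_variant=1209.0, area_wild_type=8791.0)
print(f"{freq:.2%}")
```

Dividing 250 ng among 92 samples gives about 2.7 ng per individual DNA, in agreement with the figure stated in the text.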

The frequencies calculated based on individual MALDI-TOF mass
spectrometry analysis of the 92 separate samples of each group of 92
are compared to those calculated based on MALDI-TOF mass
spectrometric analysis of pools of DNA from 92 samples in Figure 9.
These comparisons are shown as "pairs" of bar graphs in the Figure, each
pair being labeled as a separate "pool" number, e.g., P1, P16, P2, etc.
Thus, for example, for P1, the allelic frequency of the polymorphism
calculated by separate analysis of each of the 92 samples was 11.41%
and the frequency calculated by analysis of a pool of all of the 92 DNA
samples was 12.09%.
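For comparison with the pooled value, an allele frequency can be computed from individual genotypes by counting alleles over all chromosomes. The sketch below is illustrative only: the genotype counts are hypothetical, since individual counts are not reported here. They are chosen because 21 variant alleles among the 184 chromosomes of 92 subjects gives 11.41%, whereas no whole number of subjects out of 92 gives that percentage.

```python
def allele_frequency_from_genotypes(n_heterozygous, n_homozygous_variant, n_subjects):
    """Variant allele frequency: variant alleles divided by total chromosomes."""
    if n_subjects <= 0:
        raise ValueError("n_subjects must be positive")
    variant_alleles = n_heterozygous + 2 * n_homozygous_variant
    return variant_alleles / (2 * n_subjects)


# Hypothetical genotype counts for one group of 92 subjects.
individual = allele_frequency_from_genotypes(
    n_heterozygous=19, n_homozygous_variant=1, n_subjects=92
)
pooled = 0.1209  # pooled estimate for P1 as stated in the text

print(f"individual: {individual:.2%}, pooled: {pooled:.2%}")
print(f"absolute difference: {abs(pooled - individual):.2%}")
```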

The similarity in frequencies calculated by analyzing separate DNA
samples individually and by pooling the DNA samples demonstrates that it
is possible, through the quantitative nature of MALDI-TOF mass
spectrometry, to analyze pooled samples and obtain accurate frequency
determinations. The ability to analyze pooled DNA samples significantly
reduces the time and costs involved in the use of the non-selected,
healthy databases as described herein. It has also been shown that it is
possible to decrease the DNA concentration of the individual samples in a
pooled mixture from 2.7 nanograms to 0.27 nanograms without any
change in the quality of the spectrum or the ability to quantitate the
amount of sample detected.
Factor VII R353Q PROBE Assay

PROBE Assay for cod353 CGG>CAG (Arg>Gln), Exon 9 G>A.
PCR fragment: 134 bp (incl. US tags; SEQ ID NOs: 22 and 23)
Frequency of A allele: Europeans about 0.1, Japanese/Chinese about
0.03-0.05 (Thromb. Haemost. 1995, 73:617-22; Diabetologia 1998,
41:760-6):

F7-353FUS4>

1201 GTGCCGGCTA CTCG GATGGC AGCAAGGACT CCTGCAAGGG GGACAGTGGA
GGCC CACATG

F7-353-P> A <F7-353RUS5

1261 CCACCCACTA CC GGGGCACG TG GTACCTGA CGGG CATCGT CA GCTGGGGC
CAGGGCTGCG

Primers (SEQ ID NOs: 24-26), with Tm where given
F7-353FUS4 CCC AGT CAC GAC GTT GTA AAA CGA TGG CAG CAA GGA CTC CTG 64 °C

F7-353-P CAC ATG CCA CCC ACT ACC

F7-353RUS5 AGC GGA TAA CAA TTT CAC ACA GGT GAC GAT GCC CGT CAG GTA C 64 °C

Masses

Allele         Product (Termination: ddG)   SEQ #   Length   Mass
F7-353-P       cacatgccacccactacc           27      18       5333.6
353 CGG        cacatgccacccactaccg          28      19       5646.8
353 CAG        cacatgccacccactaccag         29      20       5960
US5-bio bio-   agcggataacaatttcacacagg      30      23       7648.6
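The relationship between the tagged reverse primer and the template sequence shown for the Factor VII assay can be checked by taking a reverse complement. The following sketch uses only sequences given above; the function and variable names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Universal US5 tag and the full F7-353RUS5 primer, as listed above.
US5_TAG = "AGCGGATAACAATTTCACACAGG"
reverse_primer = "AGCGGATAACAATTTCACACAGGTGACGATGCCCGTCAGGTAC"

# Stretch of the displayed template strand marked for the reverse primer.
template_site = "GTACCTGACGGGCATCGTCA"

# The target-specific part is whatever follows the universal tag.
target_part = reverse_primer[len(US5_TAG):]
matches = reverse_complement(target_part) == template_site
print(target_part, matches)
```

The target-specific portion of F7-353RUS5 is the reverse complement of the marked template stretch, as expected for a primer that anneals to the displayed strand and is extended toward the forward primer.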



Conclusion

The above examples demonstrate an effect of altered frequency of
disease-causing genetic factors within the general population.
Interpretation of those results allows prediction of the medical relevance
of polymorphic genetic alterations. In addition, conclusions can be drawn
with regard to their penetrance, diagnostic specificity, positive predictive
value, onset of disease, most appropriate onset of preventive strategies,
and the general applicability of genetic alterations identified in isolated
populations to panmixed populations. Therefore, an age- and sex-stratified
population-based sample bank that is ethnically homogenous is a suitable
tool for rapid identification and validation of genetic factors regarding their
potential medical utility.

EXAMPLE 3
MORBIDITY AND MORTALITY MARKERS
Sample Bank and Initial Screening

Healthy samples were obtained through the blood bank of San
Bernardino, CA. Prior to blood collection, donors signed a consent
form and agreed that their blood would be used in genetic studies with
regard to human aging. All samples were anonymized. Tracking back of
samples is not possible.

Isolation of DNA from blood samples of a healthy donor population
Blood was obtained from a donor by venous puncture and preserved
with 1 mM EDTA pH 8.0. Ten milliliters of whole blood from each donor
was centrifuged at 2000 x g. One milliliter of the buffy coat was added to
9 milliliters of 155 mM NH4Cl, 10 mM KHCO3, and 0.1 mM Na2EDTA,
incubated 10 minutes at room temperature and centrifuged for 10
minutes at 2000 x g. The supernatant was removed, and the white cell
pellet was washed in 155 mM NH4Cl, 10 mM KHCO3, and 0.1 mM
Na2EDTA and resuspended in 4.5 milliliters of 50 mM Tris, 5 mM EDTA,
and 1% SDS. Proteins were precipitated from the cell lysate by 6 M
ammonium acetate, pH 7.3, and separated from the nucleic acid by
centrifugation at 3000 x g. The nucleic acid was recovered from the
supernatant by the addition of an equal volume of 100% isopropanol and
centrifugation at 2000 x g. The dried nucleic acid pellet was hydrated in
10 mM Tris pH 7.6 and 1 mM Na2EDTA and stored at 4°C.

In this study, samples were pooled as shown in Table 1. Both
parents of the blood donors were of Caucasian origin.
Table 1

Pool ID   Sex       Age range     # individuals
SP1       Females   18-39 years   276
SP2       Males     18-39 years   276
SP3       Females   60-69 years   184
SP4       Males     60-79 years   368

More than 400 SNPs were tested using all four pools. After one test run,
34 assays were selected to be re-assayed at least once. Finally, 10
assays repeatedly showed differences in allele frequencies of several
percent and, therefore, fulfilled the criteria to be tested using the
individual samples. The average allele frequencies and standard deviations
are tabulated in Table 2.
Table 2

Assay ID   SP1     SP1-STD   SP2     SP2-STD   SP3     SP3-STD   SP4     SP4-STD
47861      0.457   0.028     0.433   0.042     0.384   0.034     0.380   0.015
47751      0.276   0.007     0.403   0.006     0.428   0.052     0.400   0.097
48319      0.676   0.013     0.627   0.018     0.755   0.009     0.686   0.034
48070      0.581   0.034     0.617   0.045     0.561   n.a.      0.539   0.032
49807      0.504   0.034     0.422   0.020     0.477   0.030     0.556   0.005
49534      0.537   0.017     0.503   n.a.      0.623   0.023     0.535   0.009
49733      0.560   0.006     0.527   0.059     0.546   0.032     0.436   0.016
49947      0.754   0.008     0.763   0.047     0.736   0.052     0.689   0.025
50128      0.401   0.022     0.363   0.001     0.294   0.059     0.345   0.013
63306      0.697   0.012     0.674   0.013     0.712   0.017     0.719   0.005



So far, 7 out of the 10 potential morbidity markers have been fully
analyzed. Additional information about the genes in which these SNPs are
located was gathered through public databases such as GenBank.
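The age-related shift in pooled allele frequency for each assay can be read from the pool means in Table 2 by subtracting the younger pool from the older pool of the same sex (SP3 minus SP1 for females, SP4 minus SP2 for males). The sketch below uses the tabulated means; the function name and the choice of a simple difference are illustrative assumptions, and it does not reproduce the selection criteria actually applied.

```python
# Mean allele frequencies from Table 2: assay ID -> (SP1, SP2, SP3, SP4).
# SP1/SP2 are the younger female/male pools, SP3/SP4 the older ones.
TABLE2_MEANS = {
    "47861": (0.457, 0.433, 0.384, 0.380),
    "47751": (0.276, 0.403, 0.428, 0.400),
    "48319": (0.676, 0.627, 0.755, 0.686),
    "48070": (0.581, 0.617, 0.561, 0.539),
    "49807": (0.504, 0.422, 0.477, 0.556),
    "49534": (0.537, 0.503, 0.623, 0.535),
    "49733": (0.560, 0.527, 0.546, 0.436),
    "49947": (0.754, 0.763, 0.736, 0.689),
    "50128": (0.401, 0.363, 0.294, 0.345),
    "63306": (0.697, 0.674, 0.712, 0.719),
}


def age_shift(sp1, sp2, sp3, sp4):
    """Older-minus-younger frequency difference for females and for males."""
    return round(sp3 - sp1, 3), round(sp4 - sp2, 3)


shifts = {assay: age_shift(*means) for assay, means in TABLE2_MEANS.items()}
for assay, (female_shift, male_shift) in shifts.items():
    print(f"{assay}: females {female_shift:+.3f}, males {male_shift:+.3f}")
```

For assay 48319, for example, the difference is +0.079 in females and +0.059 in males.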
AKAPS

Candidate morbidity and mortality markers include housekeeping
genes, such as genes involved in signal transduction. Among such genes
are the A-kinase anchoring protein (AKAP) genes, which participate in
signal transduction pathways involving protein phosphorylation. Protein
phosphorylation is an important mechanism for enzyme regulation and the
transduction of extracellular signals across the cell membrane in
eukaryotic cells. A wide variety of cellular substrates, including enzymes,
membrane receptors, ion channels and transcription factors, can be
phosphorylated in response to extracellular signals that interact with cells.
A key enzyme in the phosphorylation of cellular proteins in response to
hormones and neurotransmitters is cyclic AMP (cAMP)-dependent protein
kinase (PKA). Upon activation by cAMP, PKA thus mediates a variety of
cellular responses to such extracellular signals. An array of PKA isozymes
is expressed in mammalian cells. The PKAs usually exist as inactive
tetramers containing a regulatory (R) subunit dimer and two catalytic (C)
subunits. Genes encoding three C subunits (Cα, Cβ and Cγ) and four R
subunits (RIα, RIβ, RIIα and RIIβ) have been identified [see Takio et al.
(1982) Proc. Natl. Acad. Sci. U.S.A. 75:2544-2548; Lee et al. (1983)
Proc. Natl. Acad. Sci. U.S.A. 80:3608-3612; Jahnsen et al. (1996) J.
Biol. Chem. 257:12352-12361; Clegg et al. (1988) Proc. Natl. Acad. Sci.
U.S.A. 85:3703-3707; and Scott (1991) Pharmacol. Ther. 50:123-145].
The type I (RI) α and type II (RII) α subunits are distributed ubiquitously,
whereas RIβ and RIIβ are present mainly in brain [see, e.g., Miki and Eddy
(1999) J. Biol. Chem. 274:29057-29062]. The type I PKA holoenzyme
(RIα and RIβ) is predominantly cytoplasmic, whereas the majority of type
II PKA (RIIα and RIIβ) associates with cellular structures and organelles
[Scott (1991) Pharmacol. Ther. 50:123-145]. Many hormones and other
signals act through receptors to generate cAMP, which binds to the R
subunits of PKA and releases and activates the C subunits to
phosphorylate proteins. Because protein kinases and their substrates are
widely distributed throughout cells, there are mechanisms in place in cells
to localize protein kinase-mediated responses to different signals. One
such mechanism involves subcellular targeting of PKAs through
association with anchoring proteins, referred to as A-kinase anchoring
proteins (AKAPs), that place PKAs in close proximity to specific
organelles or cytoskeletal components and particular substrates, thereby
providing for more specific PKA interactions and localized responses [see,
e.g., Scott et al. (1990) J. Biol. Chem. 255:21561-21566; Bregman et al.
(1991) J. Biol. Chem. 266:7207-7213; and Miki and Eddy (1999) J. Biol.
Chem. 274:29057-29062]. Anchoring not only places the kinase close to
preferred substrates, but also positions the PKA holoenzyme at sites
where it can optimally respond to fluctuations in the second messenger
cAMP [Mochly-Rosen (1995) Science 266:247-251; Faux and Scott
(1996) Trends Biochem. Sci. 27:312-315; Hubbard and Cohen (1993)
Trends Biochem. Sci. 76:172-177].

Up to 75% of type II PKA is localized to various intracellular sites
through association of the regulatory subunit (RII) with AKAPs [see, e.g.,
Hausken et al. (1996) J. Biol. Chem. 277:29016-29022]. RII subunits of
PKA bind to AKAPs with nanomolar affinity [Carr et al. (1992) J. Biol.
Chem. 267:13376-13382], and many AKAP-RII complexes have been
isolated from cell extracts. RI subunits of PKA bind to AKAPs with only
micromolar affinity [Burton et al. (1997) Proc. Natl. Acad. Sci. U.S.A.
94:11067-11072]. Evidence of binding of a PKA RI subunit to an AKAP
has been reported [Miki and Eddy (1998) J. Biol. Chem. 273:34384-34390]
in which RIα-specific and RIα/RIIα dual specificity PKA anchoring
domains were identified on FSC1/AKAP82. Additional dual specific
AKAPs, referred to as D-AKAP1 and D-AKAP2, which interact with the
type I and type II regulatory subunits of PKA have also been reported
[Huang et al. (1997) J. Biol. Chem. 272:8057-8064; Huang et al. (1997)
Proc. Natl. Acad. Sci. U.S.A. 94:11184-11189].

More than 20 AKAPs have been reported in different tissues and
species. Complementary DNAs (cDNAs) encoding AKAPs have been
isolated from diverse species, ranging from Caenorhabditis elegans and
Drosophila to human [see, e.g., Colledge and Scott (1999) Trends Cell
Biol. 5:216-221]. Regions within AKAPs that mediate association with
RII subunits of PKA have been identified. These regions of approximately
10-18 amino acid residues vary substantially in primary sequence, but
secondary structure predictions indicate that they are likely to form an
amphipathic helix with hydrophobic residues aligned along one face of the
helix and charged residues along the other [Carr et al. (1991) J. Biol.
Chem. 266:14188-14192; Carr et al. (1992) J. Biol. Chem. 267:13376-13382].
Hydrophobic amino acids with a long aliphatic side chain, e.g.,
valine, leucine or isoleucine, may participate in binding to RII subunits
[Glantz et al. (1993) J. Biol. Chem. 265:12796-12804].

Many AKAPs also have the ability to bind to multiple proteins,
including other signaling enzymes. For example, AKAP79 binds to PKA,
protein kinase C (PKC) and the protein phosphatase calcineurin (PP2B)
[Coghlan et al. (1995) Science 267:108-112 and Klauck et al. (1996)
Science 277:1589-1592]. Therefore, the targeting of AKAP79 to
neuronal postsynaptic membranes brings together enzymes with opposite
catalytic activities in a single complex.

AKAPs thus serve as potential regulatory mechanisms that increase
the selectivity and intensity of a cAMP-mediated response. There is a
need, therefore, to identify and elucidate the structural and functional
properties of AKAPs in order to gain a complete understanding of the
important role these proteins play in the basic functioning of cells.
AKAP10

The sequence of a human AKAP10 cDNA (also referred to as D-AKAP2)
is available in the GenBank database, at accession numbers
AF037439 (SEQ ID NO: 31) and NM_007202. The AKAP10 gene is
located on chromosome 17.

The sequence of a mouse D-AKAP2 cDNA is also available in the
GenBank database (see accession number AF021833). The mouse D-AKAP2
protein contains an RGS domain near the amino terminus that is
characteristic of proteins that interact with Gα subunits and possess
GTPase activating protein-like activity [Huang et al. (1997) Proc. Natl.
Acad. Sci. U.S.A. 94:11184-11189]. The human AKAP10 protein also
has sequences homologous to RGS domains. The carboxy-terminal 40
residues of the mouse D-AKAP2 protein are responsible for the interaction
with the regulatory subunits of PKA. This sequence is fairly well
conserved between the mouse D-AKAP2 and human AKAP10 proteins.

Polymorphisms of the human AKAP10 gene and polymorphic AKAP10
proteins

Polymorphisms of AKAP genes that alter gene expression,
regulation, protein structure and/or protein function are more likely to
have a significant effect on the regulation of enzyme (particularly PKA)
activity, cellular transduction of signals and responses thereto and on the
basic functioning of cells than polymorphisms that do not alter gene
and/or protein function. Included in the polymorphic AKAPs provided
herein are human AKAP10 proteins containing differing amino acid
residues at position number 646.

Amino acid 646 of the human AKAP10 protein is located in the
carboxy-terminal region of the protein within a segment that participates
in the binding of R-subunits of PKAs. This segment includes the
carboxy-terminal 40 amino acids.

The amino acid residue reported for position 646 of the human
AKAP10 protein is an isoleucine. Polymorphic human AKAP10 proteins
provided herein have the amino acid sequence but contain residues other
than isoleucine at amino acid position 646 of the protein. In particular
embodiments of the polymorphic human AKAP10 proteins provided
herein, the amino acid at position 646 is a valine, leucine or phenylalanine
residue.

An A to G transition at nucleotide 2073 of the human AKAP10 coding
sequence

As described herein, an allele of the human AKAP10 gene that
contains a specific polymorphism at position 2073 of the coding
sequence and thereby encodes a valine at position 646 has been detected
in varying frequencies in DNA samples from younger and older segments
of the human population. In this allele, the nucleotide at position 2073 of the
AKAP10 gene coding sequence is changed from an A to a G, giving rise
to an altered sequence in which the codon for amino acid 646 changes
from ATT, coding for isoleucine, to GTT, coding for valine.
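The effect of the first-position substitution on the codon for amino acid 646 can be illustrated with a small lookup. The sketch below contains only the four codons needed for the example, taken from the standard genetic code; the function and variable names are illustrative.

```python
# Subset of the standard genetic code covering first-position variants of ATT.
CODON_TABLE = {
    "ATT": "Ile",
    "GTT": "Val",
    "CTT": "Leu",
    "TTT": "Phe",
}


def substitute_first_base(codon, new_base):
    """Return the codon obtained by replacing its first nucleotide."""
    return new_base + codon[1:]


reference_codon = "ATT"
variant_codon = substitute_first_base(reference_codon, "G")
print(reference_codon, CODON_TABLE[reference_codon])
print(variant_codon, CODON_TABLE[variant_codon])

# All three alternative first bases and the residues they would encode.
alternatives = {
    base: CODON_TABLE[substitute_first_base(reference_codon, base)]
    for base in "GCT"
}
print(alternatives)
```

In the standard genetic code, the three possible first-position substitutions of ATT yield valine, leucine and phenylalanine, the same residues named above for position 646 of the polymorphic proteins.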

Morbidity marker 1: human protein kinase A anchoring protein
(AKAP10-1)

PCR Amplification and BiomassPROBE assay detection of AKAP10-1 in a
healthy donor population

PCR Amplification of donor population for AKAP10
PCR primers were synthesized by OPERON using phosphoramidite
chemistry. Amplification of the AKAP10 target sequence was carried out
in a single 50 µl PCR reaction with 100 ng-1 µg of pooled human genomic
DNAs. Individual DNA concentrations within the
pooled samples were present in equal concentration with the final
concentration ranging from 1-25 ng. Each reaction contained 1X PCR
buffer (Qiagen, Valencia, CA), 200 µM dNTPs, 1 U Hotstar Taq
polymerase (Qiagen, Valencia, CA), 4 mM MgCl2, and 25 pmol of the
forward primer containing the universal primer sequence and the target
specific sequence 5'-TCTCAATCATGTGCATTGAGG-3' (SEQ ID NO: 45),
2 pmol of the reverse primer
5'-AGCGGATAACAATTTCACACAGGGATCACACAGCCATCAGCAG-3'
(SEQ ID NO: 46), and 10 pmol of a biotinylated universal primer
complementary to the 5' end of the PCR amplicon
5'-AGCGGATAACAATTTCACACAGG-3' (SEQ ID NO: 47). After an initial
round of amplification of the target with the specific forward and
reverse primers, the 5' biotinylated universal primer then hybridized and
acted as a reverse primer, thereby introducing a 3' biotin capture moiety
into the molecule. The amplification protocol results in a 5'-biotinylated
double-stranded DNA amplicon and dramatically reduces the cost of
high-throughput genotyping by eliminating the need to 5' biotin label each
forward primer used in genotyping. Thermal cycling was performed in
0.2 mL tubes or a 96-well plate using an MJ Research Thermal Cycler
(calculated temperature) with the following cycling parameters: 94°C for
5 min; 45 cycles: 94°C for 20 sec, 56°C for 30 sec, 72°C for 60 sec;
72°C for 3 min.
Immobilization of DNA

The 50 µl PCR reaction was added to 25 µl of streptavidin-coated magnetic
beads (Dynal) prewashed three times and resuspended in 1 M NH4Cl,
0.06 M NH4OH. The PCR amplicons were allowed to bind to the beads for
15 minutes at room temperature. The beads were then collected with a
magnet and the supernatant containing unbound DNA was removed. The
unbound strand was released from the double-stranded amplicons by
incubation in 100 mM NaOH and washing of the beads three times with
10 mM Tris pH 8.0.

BiomassPROBE assay analysis of donor population for AKAP10-1 (clone
48319)

Genotyping using the BiomassPROBE assay methods was carried
out by resuspending the DNA-coated magnetic beads in 26 mM Tris-HCl
pH 9.5, 6.5 mM MgCl2, 50 mM of dTTP and 50 mM each of
ddCTP, ddATP and ddGTP, 2.5 U of a thermostable DNA polymerase
(Amersham) and 20 pmol of a template-specific oligonucleotide PROBE
primer 5'-CTGGCGCCCACGTGGTCAA-3' (SEQ ID NO: 48) (Operon).
Primer extension occurs with three cycles of oligonucleotide primer
hybridization and extension. The extension products were analyzed after
denaturation from the template with 50 mM NH4Cl and transfer of 150 nL
of each sample to a silicon chip preloaded with 150 nL of 3-HPA matrix
material. The sample material was allowed to crystallize and was
analyzed by MALDI-TOF (Bruker, PerSeptive). The SNP that is present in
AKAP10-1 is a T to C transition at nucleotide number 156277 of the
sequence of a genomic clone of the AKAP10 gene (GenBank Accession
No. AC005730) (SEQ ID NO: 36). SEQ ID NO: 35 represents the
nucleotide sequence of human chromosome 17, which contains the
genomic nucleotide sequence of the human AKAP10 gene, and SEQ ID
NO: 36 represents the nucleotide sequence of human chromosome 17,
which contains the genomic nucleotide sequence of the human AKAP10-1
allele. The mass of the primer used in the BiomassPROBE reaction was
5500.6 daltons. In the presence of the SNP, the primer is extended by
the addition of ddC, giving a product with a mass of 5773.8 daltons. The
wildtype gene results in the addition of dT and ddG to the primer to produce an
extension product having a mass of 6101 daltons.

The frequency of the SNP was measured in a population of
age-selected healthy individuals. Five hundred fifty-two (552) individuals
between the ages of 18-39 years (276 females, 276 males) and 552
individuals between the ages of 60-79 (184 females between the ages of
60-69, 368 males between the ages of 60-79) were tested for the
presence of the polymorphism localized in the non-translated 3' region of
AKAP10. Differences in the frequency of this polymorphism with
increasing age were observed among healthy individuals.
Statistical analysis showed that the significance level for differences in
the allelic frequency between the "younger" and the "older"
populations was p = 0.0009 and for genotypes was p = 0.003.
This marker yielded the most significant result with regard to allele
and genotype frequencies in the age-stratified population. Figure 19
shows the allele and genotype frequency in both genders as well as in the
entire population. For the latter, the significance for alleles was p = 0.0009
and for genotypes was p = 0.003. The young and old populations were in
Hardy-Weinberg equilibrium. A preferential change of one particular
genotype was not seen.

The polymorphism is localized in the non-translated 3'-region of the
gene encoding the human protein kinase A anchoring protein (AKAP10).
The gene is located on chromosome 17. Its structure includes 15 exons
and 14 intervening sequences (introns). The encoded protein is
responsible for the sub-cellular localization of the cAMP-dependent
protein kinase and, therefore, plays a key role in the G-protein
mediated receptor-signaling pathway (Huang et al. PNAS (1997)
94:11184-11189). Since its localization is outside the coding region,
this polymorphism is most likely in linkage disequilibrium (LD) with
other non-synonymous polymorphisms that could cause amino acid
substitutions and subsequently alter the function of the protein.
Sequence comparison of different GenBank database entries concerning
this gene revealed six further potential polymorphisms, of which two
are expected to change the respective amino acid (see Table 3).
Table 3

Exon    Codon    Nucleotides    Amino acid
3       100      GCT>GCC        Ala>Ala
4       177      ATG>GTG        Met>Val
8       424      GGG>GGC        Gly>Gly
10      524      CCG>CTG        Pro>Leu
12      591      GTG>GTC        Val>Val
12      599      CGC>CGA        Arg>Arg
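The distinction drawn in Table 3 between changes that alter the amino
acid and changes that do not can be checked mechanically. The
following Python sketch classifies each listed codon change using a
partial codon table that covers only the codons in Table 3. The codon
177 reference is taken here as ATG, the standard methionine codon,
which is an assumption; the names CODON_TO_AA, VARIANTS and classify
are illustrative only.

```python
# Partial standard genetic code: only the codons that occur in Table 3.
CODON_TO_AA = {
    "GCT": "Ala", "GCC": "Ala",
    "ATG": "Met", "GTG": "Val", "GTC": "Val",
    "GGG": "Gly", "GGC": "Gly",
    "CCG": "Pro", "CTG": "Leu",
    "CGC": "Arg", "CGA": "Arg",
}

# (exon, codon number, reference codon, variant codon)
# Codon 177 is assumed to be ATG (Met) in the reference sequence.
VARIANTS = [
    (3, 100, "GCT", "GCC"),
    (4, 177, "ATG", "GTG"),
    (8, 424, "GGG", "GGC"),
    (10, 524, "CCG", "CTG"),
    (12, 591, "GTG", "GTC"),
    (12, 599, "CGC", "CGA"),
]


def classify(ref_codon, alt_codon):
    """Label a codon change as synonymous or non-synonymous."""
    same = CODON_TO_AA[ref_codon] == CODON_TO_AA[alt_codon]
    return "synonymous" if same else "non-synonymous"


non_synonymous = [v for v in VARIANTS if classify(v[2], v[3]) == "non-synonymous"]
for exon, codon, ref, alt in non_synonymous:
    print(exon, codon, CODON_TO_AA[ref], ">", CODON_TO_AA[alt])
```

Under these assumptions, two of the six listed changes are
non-synonymous, in agreement with the statement preceding the table.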



Morbidity marker 2: human protein kinase A anchoring protein
(AKAP10-5)

Discovery of AKAP10-5 Allele (SEQ ID NO: 33) 

Genomic DNA was isolated from blood (as described above) of
seventeen (17) individuals with a CC genotype at the AKAP10-1 gene
locus and a single heterozygous individual (CT) (as described). A
target sequence in the AKAP10-1 gene which encodes the C-terminal PKA
binding domain was amplified using the polymerase chain reaction. PCR
primers were synthesized by OPERON using phosphoramidite chemistry.
Amplification of the AKAP10-1 target sequence was carried out in
individual 50 µL PCR reactions with 25 ng of human genomic DNA
template. Each reaction contained 1X PCR buffer (Qiagen, Valencia,
CA), 200 µM dNTPs, 1 U Hotstar Taq polymerase (Qiagen, Valencia, CA),
4 mM MgCl2, 25 pmol of the forward primer (Ex13F) containing the
universal primer sequence and the target specific sequence
5'-TCC CAA AGT GCT GGA ATT AC-3' (SEQ ID NO: 53), and 2 pmol of the
reverse primer (Ex14R) 5'-GTC CAA TAT ATG CAA ACA GTT G-3' (SEQ ID
NO: 54). Thermal cycling was performed in 0.2 mL tubes or a 96-well
plate using an MJ Research Thermal Cycler (MJ Research, Waltham, MA)
(calculated temperature) with the following cycling parameters: 94°C
for 5 min; 45 cycles of 94°C for 20 sec, 56°C for 30 sec, 72°C for
60 sec; 72°C for 3 min. After amplification the amplicons were
purified by chromatography (Mo Bio Laboratories, Solana Beach, CA).

The sequence of the 18 amplicons, representing the target region,
was determined using a standard Sanger cycle sequencing method with
25 nmol of the PCR amplicon, 3.2 µM DNA sequencing primer
5'-CCC ACA GCA GTT AAT CCT TC-3' (SEQ ID NO: 55), and chain
terminating dRhodamine labeled 2',3' dideoxynucleotides (PE
Biosystems, Foster City, CA) using the following cycling parameters:
96°C for 15 seconds; 25 cycles of 55°C for 15 seconds, 60°C for
4 minutes. The sequencing products were precipitated with 0.3 M NaOAc
and ethanol. The precipitate was centrifuged and dried. The pellets
were resuspended in deionized formamide and separated on a 5%
polyacrylamide gel. The sequence was determined using the
"Sequencher" software (Gene Codes, Ann Arbor, MI).

The sequence of all 17 of the amplicons that are homozygous for the
AKAP10-1 SNP revealed a polymorphism at nucleotide position 152171
(numbering for GenBank Accession No. AC005730 for AKAP10 genomic
clone (SEQ ID NO: 35)) with A replaced by G. This SNP can also be
designated as located at nucleotide 2073 of a cDNA clone of the
wildtype AKAP10 (GenBank Accession No. AF037439) (SEQ ID NO: 31). The
amino acid sequence of the human AKAP10 protein is provided as SEQ ID
NO: 32. This single nucleotide polymorphism was designated as
AKAP10-5 (SEQ ID NO: 33) and resulted in a substitution of a valine
for an isoleucine residue at amino acid position 646 of the amino
acid sequence of human AKAP10 (SEQ ID NO: 32).

PCR Amplification and BiomassPROBE assay detection of AKAP10-5 in a 
healthy donor population 

The healthy population stratified by age is a very efficient and
universal screening tool for morbidity-associated genes, allowing for
the detection of changes in allelic frequencies in the young compared
to the old population. Individual samples of this healthy population
base can be pooled to further increase the throughput.

Healthy samples were obtained through the blood bank of San
Bernardino, CA. Both parents of the blood donors were of Caucasian
origin. In practice, a healthy subject, when human, is defined as a
human donor who passes blood bank criteria to donate blood for
eventual use in the general population. These criteria are as
follows: free of detectable viral, bacterial, mycoplasma, and
parasitic infections; not anemic; and then further selected based
upon a questionnaire regarding history (see Figure 3). Thus, a
healthy population represents an unbiased population of sufficient
health to donate blood according to blood bank criteria, and not
further selected for any disease state. Typically such individuals
are not taking any medications.

PCR primers were synthesized by OPERON using phosphoramidite
chemistry. Amplification of the AKAP10 target sequence was carried out
in a single 50 µL PCR reaction with 100 ng-1 µg of pooled human
genomic DNAs. Individual DNA concentrations within the pooled samples
were present in equal concentration with the final concentration
ranging from 1-25 ng. Each reaction contained 1X PCR buffer (Qiagen,
Valencia, CA), 200 µM dNTPs, 1 U Hotstar Taq polymerase (Qiagen,
Valencia, CA), 4 mM MgCl2, and 25 pmol of the forward primer
containing the universal primer sequence and the target specific
sequence
5'-AGCGGATAACAATTTCACACAGGGAGCTAGCTTGGAAGATTGC-3' (SEQ ID NO: 41),
2 pmol of the reverse primer
5'-GTCCAATATATGCAAACAGTTG-3' (SEQ ID NO: 54), and 10 pmol of a
biotinylated universal primer complementary to the 5' end of the PCR
amplicon BIO:5'-AGCGGATAACAATTTCACACAGG-3' (SEQ ID NO: 43).
After an initial round of amplification of the target with the
specific forward and reverse primers, the 5' biotinylated universal
primer hybridized and acted as a forward primer, thereby introducing
a 5' biotin capture moiety into the molecule. The amplification
protocol resulted in a 5'-biotinylated double stranded DNA amplicon
and dramatically reduced the cost of high throughput genotyping by
eliminating the need to 5' biotin label every forward primer used in
a genotyping.
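The universal-tag design described above can be illustrated by
splitting the tagged forward primer into its universal and
target-specific parts. The following Python sketch uses the sequences
given for SEQ ID NO: 41 and SEQ ID NO: 43; the function name
split_tagged_primer is hypothetical and serves only as an
illustration.

```python
# Sequences as given in the text (biotin label omitted).
UNIVERSAL = "AGCGGATAACAATTTCACACAGG"                      # SEQ ID NO: 43
FORWARD = "AGCGGATAACAATTTCACACAGGGAGCTAGCTTGGAAGATTGC"    # SEQ ID NO: 41


def split_tagged_primer(primer, tag):
    """Return (tag, target_specific) for a primer carrying a 5' tag."""
    if not primer.startswith(tag):
        raise ValueError("primer does not carry the universal tag at its 5' end")
    return tag, primer[len(tag):]


tag, target_specific = split_tagged_primer(FORWARD, UNIVERSAL)
print("universal tag:  ", tag)
print("target specific:", target_specific)
```

Because every tagged forward primer shares the same 5' sequence, a
single biotinylated universal primer can be reused across assays,
which is the cost saving the text refers to.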

Thermal cycling was performed in 0.2 mL tubes or a 96-well plate
using an MJ Research Thermal Cycler (calculated temperature) with the
following cycling parameters: 94°C for 5 min; 45 cycles of 94°C for
20 sec, 56°C for 30 sec, 72°C for 60 sec; 72°C for 3 min.
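As a small worked example, the cycling parameters above can be
written as data and the total programmed hold time computed. The
Python sketch below counts hold times only; ramp times between
temperatures depend on the instrument and are not included. The
variable names are illustrative.

```python
# (temperature in degrees C, hold time in seconds), taken from the text
INITIAL_DENATURATION = (94, 5 * 60)
CYCLE_STEPS = [(94, 20), (56, 30), (72, 60)]
N_CYCLES = 45
FINAL_EXTENSION = (72, 3 * 60)


def total_hold_seconds():
    """Sum of programmed hold times, ignoring temperature ramps."""
    per_cycle = sum(seconds for _, seconds in CYCLE_STEPS)
    return INITIAL_DENATURATION[1] + N_CYCLES * per_cycle + FINAL_EXTENSION[1]


total = total_hold_seconds()
print(total, "seconds, or", total / 60, "minutes")
```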
Immobilization of DNA 

The 50 µL PCR reaction was added to 25 µL of streptavidin coated
magnetic beads (Dynal, Oslo, Norway), which were prewashed three
times and resuspended in 1 M NH4Cl, 0.06 M NH4OH. The 5' end of one
strand of the double stranded PCR amplicons was allowed to bind to the
beads for 15 minutes at room temperature. The beads were then
collected with a magnet and the supernatant containing unbound DNA
was removed. The hybridized but unbound strand was released from the
double stranded amplicons by incubation in 100 mM NaOH and washing
of the beads three times with 10 mM Tris pH 8.0.
Detection of AKAP10-5 using BiomassPROBE™ Assay 

BiomassPROBE™ assay of primer extension analysis (see, U.S.
Patent No. 6,043,031) of donor population for AKAP10-5 (SEQ ID NO:
33) was performed. Genotyping using these methods was carried out by
resuspending the DNA-coated magnetic beads in 26 mM Tris-HCl pH 9.5,
6.5 mM MgCl2, 50 mM dTTP, 50 mM each of ddCTP, ddATP and ddGTP, 2.5 U
of a thermostable DNA polymerase (Amersham), and 20 pmol of a
template-specific oligonucleotide PROBE primer
5'-ACTGAGCCTGCTGCATAA-3' (SEQ ID NO: 44) (Operon). Primer
extension occurred over three cycles of oligonucleotide primer
hybridization and extension. The extension products were analyzed
after denaturation from the template with 50 mM NH4Cl and transfer of
150 nL of each sample to a silicon chip preloaded with 150 nL of H3PA
matrix material. The sample material was allowed to crystallize and
was analyzed by MALDI-TOF (Bruker, PerSeptive). The primer has a mass
of 5483.6 daltons. The SNP results in the addition of a ddC to the
primer, giving a mass of 5756.8 daltons for the extended product. The
wild type results in the addition of a dT and ddG to the primer,
giving a mass of 6101 daltons.
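The product masses stated above follow from adding the mass of each
incorporated nucleotide residue to the primer mass. The following
Python sketch reproduces that arithmetic for the AKAP10-5 PROBE
primer. The per-residue mass increments are approximate average
values rounded to 0.1 Da; they are assumptions for illustration and
are not taken from the text, and the names RESIDUE_MASS and
extension_mass are hypothetical.

```python
# Approximate average mass (Da) added per incorporated residue.
# Assumed values, rounded to 0.1 Da; not stated in the text.
RESIDUE_MASS = {
    "dT": 304.2,
    "ddA": 297.2,
    "ddC": 273.2,
    "ddG": 313.2,
}


def extension_mass(primer_mass, added_residues):
    """Primer mass plus the masses of the residues added by extension."""
    added = sum(RESIDUE_MASS[name] for name in added_residues)
    return round(primer_mass + added, 1)


primer = 5483.6  # AKAP10-5 PROBE primer mass given in the text

snp_product = extension_mass(primer, ["ddC"])             # SNP allele
wildtype_product = extension_mass(primer, ["dT", "ddG"])  # wild type
print("SNP product:      ", snp_product)
print("wild type product:", wildtype_product)
```

The mass difference between the two products, a little over 340 Da
under these assumptions, is what allows the alleles to be resolved as
separate peaks.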

The frequency of the SNP was measured in a population of age-selected
healthy individuals. Seven hundred thirteen (713) individuals under
40 years of age (360 females, 353 males) and 703 individuals over
60 years of age (322 females, 381 males) were tested for the presence
of the SNP, AKAP10-5 (SEQ ID NO: 33). Results are presented below in
Table 1.



TABLE 1

AKAP10-5 (2073V) frequency comparison in 2 age groups

                               <40      >60      delta G allele
Female    Alleles      *G      38.6     34.6     4.0
                       *A      61.4     65.4
          Genotypes    G       13.9     11.8     2.1
                       GA      49.4     45.7
                       A       36.7     42.5

Male      Alleles      *G      41.4     37.0     4.4
                       *A      58.6     63.0
          Genotypes    G       18.4     10.8     7.7
                       GA      45.9     52.5
                       A       35.7     36.7

Total     Alleles      *G      40.0     35.9     4.1
                       *A      60.0     64.1
          Genotypes    G       16.1     11.2     4.9
                       GA      47.7     49.4
                       A       36.2     39.4





Figure 20 graphically shows these results of allele and genotype
distribution in the age and sex stratified Caucasian population.
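The allele and genotype rows of Table 1 are related by simple
arithmetic, and the genotype frequencies can be compared with those
expected under Hardy-Weinberg equilibrium. The following Python
sketch does both for the total <40 group. It assumes the table values
are percentages and that the genotype rows labeled G, GA and A denote
the GG, GA and AA genotypes; the function names are illustrative.

```python
def allele_freq_from_genotypes(hom_g, het, hom_a):
    """G and A allele frequencies (%) from genotype percentages."""
    g = hom_g + het / 2.0
    a = hom_a + het / 2.0
    return g, a


def hardy_weinberg_expected(g_pct):
    """Expected GG, GA and AA percentages for a G allele percentage."""
    p = g_pct / 100.0
    q = 1.0 - p
    return 100 * p * p, 100 * 2 * p * q, 100 * q * q


# Total population, <40 age group (genotype values from Table 1)
g, a = allele_freq_from_genotypes(16.1, 47.7, 36.2)
expected = hardy_weinberg_expected(g)

print("G allele %:", round(g, 2), " A allele %:", round(a, 2))
print("expected GG, GA, AA %:", [round(x, 1) for x in expected])
```

The recomputed G allele frequency agrees with the tabulated 40.0 to
within rounding, and the expected genotype percentages lie close to
the observed 16.1, 47.7 and 36.2. A formal test of equilibrium would
need the underlying genotype counts, which are not given here.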

Morbidity marker 3: human methionine sulfoxide reductase A (msrA) 
The age-related allele and genotype frequency of this marker in 
both genders and the entire population is shown in Figure 21. The 
decrease of the homozygous CC genotype in the older male population is
highly significant.

Methionine sulfoxide reductase A (#63306) 

PCR Amplification and BiomassPROBE assay detection of the human
methionine sulfoxide reductase A (h-msr-A) in a healthy donor population
PCR Amplification of donor population for h-msr-A
PCR primers were synthesized by OPERON using phosphoramidite
chemistry. Amplification of the h-msr-A target sequence was carried
out in a single 50 µL PCR reaction with 100 ng-1 µg of pooled human
genomic DNA templates. Individual DNA concentrations within the
pooled samples were present in an equal concentration with the final
concentration ranging from 1-25 ng. Each reaction contained 1X PCR
buffer (Qiagen, Valencia, CA), 200 µM dNTPs, 1 U Hotstar Taq
polymerase (Qiagen, Valencia, CA), 4 mM MgCl2, 25 pmol of the forward
primer containing the universal primer sequence and the target
specific sequence 5'-TTTCTCTGCACAGAGAGGC-3' (SEQ ID NO: 49), 2 pmol of
the reverse primer
5'-AGCGGATAACAATTTCACACAGGGCTGAAATCCTTCGCTTTACC-3'
(SEQ ID NO: 50), and 10 pmol of a biotinylated universal primer
complementary to the 5' end of the PCR amplicon
5'-AGCGGATAACAATTTCACACAGG-3' (SEQ ID NO: 51). After an initial
round of amplification of the target with the specific forward and
reverse primers, the 5' biotinylated universal primer was then
hybridized and acted as a reverse primer thereby introducing a 3'
biotin capture moiety into the molecule. The amplification protocol
results in a 5'-biotinylated double stranded DNA amplicon and
dramatically reduces the cost of high throughput genotyping by
eliminating the need to 5' biotin label each forward primer used in a
genotyping. Thermal cycling was performed in 0.2 mL tubes or a
96-well plate using an MJ Research Thermal Cycler (calculated
temperature) with the following cycling parameters: 94°C for 5 min;
45 cycles of 94°C for 20 sec, 56°C for 30 sec, 72°C for 60 sec; 72°C
for 3 min.
Immobilization of DNA 

The 50 µL PCR reaction was added to 25 µL of streptavidin coated
magnetic beads (Dynal) prewashed three times and resuspended in 1 M
NH4Cl, 0.06 M NH4OH. The PCR amplicons were allowed to bind to the
beads for 15 minutes at room temperature. The beads were then
collected with a magnet and the supernatant containing unbound DNA
was removed. The unbound strand was released from the double stranded
amplicons by incubation in 100 mM NaOH and washing of the beads three
times with 10 mM Tris pH 8.0.

BiomassPROBE assay analysis of donor population for h-msr A 

Genotyping using the BiomassPROBE assay methods was carried out by
resuspending the DNA-coated magnetic beads in 26 mM Tris-HCl pH 9.5,
6.5 mM MgCl2, 50 mM dTTP, 50 mM each of ddCTP, ddATP and ddGTP, 2.5 U
of a thermostable DNA polymerase (Amersham), and 20 pmol of a
template-specific oligonucleotide PROBE primer
5'-CTGAAAAGGGAGAGAAAG-3' (Operon) (SEQ ID NO: 52). Primer extension
occurred over three cycles of oligonucleotide primer hybridization
and extension. The extension products were analyzed after
denaturation from the template with 50 mM NH4Cl and transfer of
150 nL of each sample to a silicon chip preloaded with 150 nL of H3PA
matrix material. The sample material was allowed to crystallize and
was analyzed by MALDI-TOF (Bruker, PerSeptive). The SNP is
represented as a T to C transition in the sequence of two ESTs. The
wild type is represented by having a T at position 128 of GenBank
Accession No. AW195104, which represents the nucleotide sequence of
an EST which is a portion of the wild type human msrA gene (SEQ ID
NO: 39). The SNP is presented as a C at position 129 of GenBank
Accession No. AW874187, which represents the nucleotide sequence of
an EST which is a portion of an allele of the human msrA gene (SEQ ID
NO: 40).

In a genomic sequence the SNP is represented as an A to G transition.
The primer utilized in the BiomassPROBE reaction had a mass of
5654.8 daltons. In the presence of the SNP the primer is extended by
the incorporation of a ddC and has a mass of 5928 daltons. In the
presence of the wildtype the primer is extended by adding a dT and a
ddC to produce a mass of 6232.1 daltons.

The frequency of the SNP was measured in a population of age-selected
healthy individuals. Five hundred fifty-two (552) individuals between
the ages of 18-39 years (276 females, 276 males) and 552 individuals
between the ages of 60-79 (184 females between the ages of 60-69,
368 males between the ages of 60-79) were tested for the presence of
the polymorphism localized in the non-translated 3' region of
h-msr-A.

The genotype difference between the male age groups among healthy
individuals is significant. For the male population allele
significance is p = 0.0009 and genotype significance is p = 0.003. The
age-related allele and genotype frequency of this marker in both
genders and the entire population is shown in Figure 21. The decrease
of the homozygous CC genotype in the older male population is highly
significant.

The polymorphism is localized in the non-translated 3'-region of the
gene encoding the human methionine sulfoxide reductase (h-msrA). The
exact localization is 451 base pairs downstream of the stop codon
(TAA). It is very likely that this SNP is in linkage disequilibrium
(LD) with another polymorphism further upstream in the coding or
promoter region; thus, it probably does not directly cause morbidity.
The enzyme methionine sulfoxide reductase has been proposed to
exhibit multiple biological functions. It may serve to repair
oxidative protein damage but also play an important role in the
regulation of proteins by activation or inactivation of their
biological functions (Moskovitz et al. (1998) PNAS 95:14071-14075).
It has also been shown that its activity is significantly reduced in
brain tissues of Alzheimer patients (Gabbita et al., (1999) J.
Neurochem 73:1660-1666). It is scientifically conceivable that
proteins involved in the metabolism of reactive oxygen species are
associated with disease.

CONCLUSION

The use of the healthy population provides for the identification of
morbidity markers. The identification of proteins involved in the
G-protein coupled signal transduction pathway or in the
detoxification of oxidative stress can be considered as convincing
results. Further confirmation and validation of other potential
polymorphisms already identified in silico in the gene encoding the
human protein kinase A anchoring protein could provide even stronger
association to morbidity and demonstrate that this gene product is a
suitable pharmaceutical or diagnostic target.

EXAMPLE 4 
MALDI-TOF Mass Spectrometry Analysis 

All of the products of the enzyme assays listed below were
analyzed by MALDI-TOF mass spectrometry. A diluted matrix solution
(0.15 µL) containing 10:1 3-hydroxypicolinic acid:ammonium citrate in
1:1 water:acetonitrile diluted 2.5-fold with water was pipetted onto a
SpectroChip (Sequenom, Inc.) and was allowed to crystallize. Then,
0.15 µL of sample was added. A linear PerSeptive Voyager DE mass
spectrometer or Bruker Biflex MALDI-TOF mass spectrometer, operating
in positive ion mode, was used for the measurements. The sample
plates were kept at 18.2 kV for 400 ns after each UV laser shot
(approximately 250 laser shots total), and then the target voltage
was raised to 20 kV. The original spectra were digitized at 500 MHz.

EXAMPLE 5 

Sample Conditioning

Where indicated in the examples below, the products of the
enzymatic digestions were purified with ZipTips (Millipore, Bedford,
MA). The ZipTips were pre-wetted with 10 µL 50% acetonitrile and
equilibrated 4 times with 10 µL 0.1 M TEAAc. The oligonucleotide
fragments were bound to the C18 in the ZipTip material by continuous
aspiration and dispensing of each sample into the ZipTip. Each
digested oligonucleotide was conditioned by washing with 10 µL 0.1 M
TEAAc, followed by 4 washing steps with 10 µL H2O. DNA fragments were
eluted from the ZipTip with 7 µL 50% acetonitrile.

Any method for conditioning the samples may be employed. Methods
for conditioning, which generally is used to increase peak
resolution, are well known (see, e.g., International PCT application
No. WO 98/20019).

EXAMPLE 6 

DNA Glycosylase-Mediated Sequence Analysis

DNA glycosylases modify DNA at each position at which a specific
nucleobase resides in the DNA, thereby producing abasic sites. In a
subsequent reaction with another enzyme, a chemical, or heat, the
phosphate backbone at each abasic site can be cleaved.

The glycosylase utilized in the following procedures was uracil-DNA
glycosylase (UDG). Uracil bases were incorporated into DNA fragments
in each position that a thymine base would normally occupy by
amplifying a DNA target sequence in the presence of uracil. Each
uracil substituted DNA amplicon was incubated with UDG, which cleaved
each uracil base in the amplicon, and was then subjected to
conditions that effected backbone cleavage at each abasic site, which
produced DNA fragments. DNA fragments were subjected to MALDI-TOF
mass spectrometry analysis. Genetic variability in the target DNA was
then assessed by analyzing mass spectra.
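The effect of a sequence variation on the fragment pattern can be
illustrated with a simplified model in which the strand is cut at
every uracil (thymine) position and the uracil itself is lost. The
Python sketch below uses short hypothetical sequences, not sequences
from the examples, and ignores the terminal phosphate groups and
other chemical details that determine the actual fragment masses. The
function name udg_fragments is illustrative.

```python
def udg_fragments(strand, min_length=1):
    """Fragments left after cutting at every T (incorporated as U).

    Simplified model: the base at each T position is removed and the
    backbone is cleaved there. End-group chemistry is ignored.
    """
    pieces = strand.upper().split("T")
    return [p for p in pieces if len(p) >= min_length]


# Hypothetical sequences differing by a single T to C change
reference = "GACCTAGGCATCCGA"
variant = "GACCTAGGCACCCGA"

print(udg_fragments(reference))
print(udg_fragments(variant))
```

In this model, replacing a T with a C removes one cleavage site, so
two shorter fragments are replaced by one longer fragment. A change
of this kind is what produces allele-specific signals in the mass
spectrum.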

Glycosylases specific for nucleotide analogs or modified
nucleotides, as described herein, can be substituted for UDG in the
following procedures. The glycosylase methods described hereafter, in
conjunction with phosphate backbone cleavage and MALDI, can be used
to analyze DNA fragments for the purposes of SNP scanning, bacteria
typing, methylation analysis, microsatellite analysis, genotyping, and
nucleotide sequencing and re-sequencing.
A. Genotyping 

A glycosylase procedure was used to genotype the DNA sequence
encoding UCP-2 (Uncoupling Protein 2). The sequence for UCP-2 is
deposited in GenBank under accession number AF096289. The sequence
variation genotyped in the following procedure was a cytosine
(C-allele) to thymine (T-allele) variation at nucleotide position
4790, which results in an alanine to valine mutation at position 55
in the UCP-2 polypeptide.

DNA was amplified using a PCR procedure with a 50 µL reaction
volume containing 5 pmol biotinylated primer having the sequence
5'-TGCTTATCCCTGTAGCTACCCTGTCTTGGCCTTGCAGATCCAA-3' (SEQ ID NO: 91),
15 pmol non-biotinylated primer having the sequence
5'-AGCGGATAACAATTTCACACAGGCCATCACACCGCGGTACTG-3' (SEQ ID NO: 92),
200 µM dATP, 200 µM dCTP, 200 µM dGTP, 600 µM dUTP (to fully replace
dTTP), 1.5 mM to 3 mM MgCl2, 1 U of HotStarTaq polymerase, and 25 ng
of CEPH DNA. Amplification was effected with 45 cycles at an
annealing temperature of 56°C.

The amplification product was then immobilized onto a solid
support by incubating 50 µL of the amplification reaction with 5 µL
of prewashed Dynabeads for 20 minutes at room temperature. The
supernatant was removed, and the beads were incubated with 50 µL of
0.1 M NaOH for 5 minutes at room temperature to denature the
double-stranded PCR product in such a fashion that single-stranded
DNA was linked to the beads. The beads were then neutralized by three
washes with 50 µL 10 mM TrisHCl (pH 8). The beads were resuspended in
10 µL of a 60 mM TrisHCl/1 mM EDTA (pH 7.9) solution, and 1 U uracil
DNA glycosylase was added to the solution for 45 minutes at 37°C to
remove uracil nucleotides present in the single-stranded DNA linked
to the beads. The beads were then washed two times with 25 µL of
10 mM TrisHCl (pH 8) and once with 10 µL of water. The biotinylated
strands were then eluted from the beads with 12 µL of 2 M NH4OH at
60°C for 10 minutes. The backbone of the DNA was cleaved by
incubating the samples for 10 min at 95°C (with a closed lid), and
ammonia was evaporated from the samples by incubating the samples for
11 min at 80°C.

The cleavage fragments were then analyzed by MALDI-TOF mass
spectrometry as described in Example 4. The T-allele generated a
unique fragment of 3254 Daltons. The C-allele generated a unique
fragment of 4788 Daltons. These fragments were distinguishable in
mass spectra. Thus, the above-identified procedure was successfully
utilized to genotype individuals heterozygous for the C-allele and
T-allele in UCP-2.
B. Glycosylase Analysis Utilizing Pooled DNA Samples 

The glycosylase assay was conducted using pooled samples to
detect genetic variability at the UCP-2 locus. DNA of known genotype
was pooled from eleven individuals and was diluted to a fixed
concentration of 5 ng/µL. The procedure provided in Example 3A was
followed using 2 pmol of forward primer having a sequence of
5'-CCCAGTCACGACGTTGTAAAACGTCTTGGCCTTGCAGATCCAAG-3' (SEQ ID NO: 93)
and 15 pmol of reverse primer having the sequence
5'-AGCGGATAACAATTTCACACAGGCCATCACACCGCGGTACTG-3' (SEQ ID NO: 94).
In addition, 5 pmol of biotinylated primer having the sequence
5'-bioCCCAGTCACGACGTTGTAAAACG-3' (SEQ ID NO: 97) may be introduced to
the PCR reaction after about two cycles. The fragments were analyzed
via MALDI-TOF mass spectroscopy (Example 4). As determined in Example
3A, the T-allele, which generated a unique fragment of 3254 Daltons,
could be distinguished in mass spectra from the C-allele, which
generated a unique fragment of 4788 Daltons. Allelic frequency in the
pooled samples was quantified by integrating the area under each
signal corresponding to an allelic fragment. Integration was
accomplished by hand calculations using equations well known to those
skilled in the art. In the pool of eleven samples, this procedure
suggested that 40.9% of the individuals harbored the T allele and
59.09% of the individuals harbored the C allele.
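The quantification step can be sketched as a ratio of integrated peak
areas. The following Python example uses hypothetical peak areas,
chosen in a 9:13 ratio because a pool of eleven diploid individuals
carries 22 alleles and 9/22 and 13/22 correspond to the reported
40.9% and 59.09%. It assumes that both allelic fragments are detected
with equal efficiency, which the text does not state. The function
name allele_frequencies is illustrative.

```python
def allele_frequencies(peak_areas):
    """Percent frequency of each allele from integrated peak areas."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {allele: 100.0 * area / total for allele, area in peak_areas.items()}


# Hypothetical integrated areas (arbitrary units) for the two
# allele-specific fragments named in the text.
areas = {"T (3254 Da)": 9.0, "C (4788 Da)": 13.0}

freqs = allele_frequencies(areas)
for allele, pct in freqs.items():
    print(allele, round(pct, 2), "%")
```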

C. Glycosylase-Mediated Microsatellite Analysis 

A glycosylase procedure was utilized to identify microsatellites of
the Bradykinin Receptor 2 (BKR-2) sequence. The sequence for BKR-2 is
deposited in GenBank under accession number X86173. BKR-2 includes
a SNP in the promoter region, which is a C to T variation, as well as
a SNP in a repeated unit, which is a G to T variation. The procedure
provided in Example 3A was utilized to identify the SNP in the
promoter region, the SNP in the microsatellite repeat region, and the
number of repeated units in the microsatellite region of BKR-2.
Specifically, a forward PCR primer having the sequence
5'-CTCCAGCTGGGCAGGAGTGC-3' (SEQ ID NO: 95) and a reverse primer
having the sequence 5'-CACTTCAGTCGCTCCCT-3' (SEQ ID NO: 96)
were utilized to amplify BKR-2 DNA in the presence of uracil. The
amplicon was fragmented by UDG followed by backbone cleavage. The
cleavage fragments were analyzed by MALDI-TOF mass spectrometry as
described in Example 4.

With regard to the SNP in the BKR-2 promoter region having a C to
T variation, the C-allele generated a unique fragment having a mass of
7342.4 Daltons and the T-allele generated a unique fragment having a
mass of 7053.2 Daltons. These fragments were distinguishable in mass
spectra. Thus, the above-identified procedure was successfully
utilized to genotype individuals heterozygous for the C-allele and
T-allele in the promoter region of BKR-2.

With regard to the SNP in the BKR-2 repeat region having a G to T
variation, the T-allele generated a unique fragment having a mass of
1784 Daltons, which was readily detected in a mass spectrum. Hence,
the presence of the T-allele was indicative of the G to T sequence
variation in the repeat region of BKR-2.

In addition, the number of repeat regions was distinguished
between individuals having two repeat sequences and individuals having
three repeat sequences in BKR-2. The DNA of these individuals did not
harbor the G to T sequence variation in the repeat sequence as each
repeat sequence contained a G at the SNP locus. The number of repeat
regions was determined in individual samples by calculating the area
under a signal corresponding to a unique DNA fragment having a mass of
2771.6 Daltons. This signal in spectra generated from individuals
having two repeat regions had an area that was thirty-three percent
less than the area under the same signal in spectra generated from
individuals having three repeat regions. Thus, the procedures
discussed above can be utilized to genotype individuals for the number
of repeat sequences present in BKR-2.
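The thirty-three percent figure is what would be expected if the
2771.6 Dalton fragment is produced once per repeat unit, so that the
signal area scales with the repeat number. The Python sketch below
works through that arithmetic. The proportionality, and the
comparability of signal areas between spectra, are assumptions made
here for illustration; the function names are hypothetical.

```python
def area_reduction_percent(n_repeats, reference_repeats=3):
    """Percent decrease in signal area relative to the reference,
    assuming area is proportional to the number of repeat units."""
    return 100.0 * (1.0 - n_repeats / reference_repeats)


def estimate_repeats(area, reference_area, reference_repeats=3):
    """Nearest whole repeat number implied by a normalized area."""
    return round(reference_repeats * area / reference_area)


print(round(area_reduction_percent(2), 1))  # two versus three repeats
print(estimate_repeats(66.7, 100.0))        # hypothetical normalized areas
```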

D. Bisulfite Treatment Coupled with Glycosylase Digestion 
Bisulfite treatment of genomic DNA can be utilized to analyze
positions of methylated cytosine residues within the DNA. Treating
nucleic acids with bisulfite deaminates cytosine residues to uracil
residues, while methylated cytosine remains unmodified. Thus, by
comparing the sequence of a PCR product generated from genomic DNA
that is not treated with bisulfite with the sequence of a PCR product
generated from genomic DNA that is treated with bisulfite, the degree
of methylation in a nucleic acid as well as the positions where
cytosine is methylated can be deduced.

Genomic DNA (2 µg) was digested by incubation with 1 µL of a
restriction enzyme at 37°C for 2 hours. An aliquot of 3 M NaOH was
added to yield a final concentration of 0.3 M NaOH in the digestion
solution. The reaction was incubated at 37°C for 15 minutes followed
by treatment with 5.35 M urea, 4.44 M bisulfite, and 10 mM
hydroquinone, where the final concentration of hydroquinone is
0.5 mM.

The sample that was treated with bisulfite (sample A) was
compared to the same digestion sample that had not undergone bisulfite
treatment (sample B). After sample A was treated with bisulfite as
described above, sample A and sample B were amplified by a standard
PCR procedure. The PCR procedure included the step of overlaying each
sample with mineral oil and then subjecting the sample to
thermocycling (20 cycles of 15 minutes at 55°C followed by 30 seconds
at 95°C). The PCR reaction contained four nucleotide bases, C, A, G,
and U. The mineral oil was removed from each sample, and the PCR
products were purified with glassmilk. Sodium iodide (3 volumes) and
glassmilk (5 µL) were added to samples A and B. The samples were then
placed on ice for 8 minutes, washed with 420 µL cold buffer,
centrifuged for 10 seconds, and the supernatant fractions were
removed. This process was repeated twice and then 25 µL of water was
added. Samples were incubated for 5 minutes at 37°C, were centrifuged
for 20 seconds, and the supernatant fraction was collected, and then
this incubation/centrifugation/supernatant fraction collection
procedure was repeated. 50 µL 0.1 M NaOH was then added to the
samples to denature the DNA. The samples were incubated at room
temperature for 5 minutes, washed three times with 50 µL of 10 mM
TrisHCl (pH 8), and resuspended in 10 µL 60 mM TrisHCl/1 mM EDTA,
pH 7.9.

The PCR products from sample A and sample B were then treated with
2 U of UDG (MBI Fermentas) and then subjected to backbone cleavage,
as described herein. The resulting fragments from each of sample A
and sample B were analyzed by MALDI-TOF mass spectroscopy as
described in Example 4. Sample A gave rise to a greater number of
fragments than the number of fragments arising from sample B,
indicating that the nucleic acid harbored at least one methylated
cytosine moiety.

EXAMPLE 7 
Fen-Ligase-Mediated Haplotyping 

Haplotyping procedures permit the selection of a fragment from one of an
individual's two homologous chromosomes and the genotyping of linked SNPs
on that fragment. The direct resolution of haplotypes can yield increased
information content, improving the diagnosis of any linked disease genes
or identifying linkages associated with those diseases. In previous
studies, haplotypes were typically reconstructed indirectly through
pedigree analysis (in cases where pedigrees were available), through
laborious and unreliable allele-specific PCR, or through single-molecule
dilution methods well known in the art.

A haplotyping procedure was used to determine the presence of
two SNPs, referred to as SNP1 and SNP2, located on one strand in a DNA
sample. The haplotyping procedure used in this assay utilized Fen-1, a
site-specific "flap" endonuclease that cleaves DNA "flaps" created by the
overlap of two oligonucleotides hybridized to a target DNA strand. The
two overlapping oligonucleotides in this example were short arm and long
arm allele-specific adaptors. The target DNA was an amplified nucleic
acid that had been denatured and contained SNP1 and SNP2.

The short arm adaptor included a unique sequence not found in the
target DNA. The 3' distal nucleotide of the short arm adaptor was
identical to one of the SNP1 alleles. Moreover, the long arm adaptor
included two regions: a 3' region complementary to the short arm and a
5' gene-specific region complementary to the fragment of interest adjacent
to the SNP. If there was a match between the adaptor and one of the
homologues, the Fen enzyme recognized and cleaved the overlapping
flap. The short arm of the adaptor was then ligated to the remainder of
the target fragment (minus the SNP site). This ligated fragment was used
as the forward primer for a second PCR reaction in which only the ligated
homologue was amplified. The second PCR product (PCR2) was then
analyzed by mass spectrometry. If there was no match between the
adaptors and the target DNA, there was no overlap, no cleavage by
Fen-1, and thus no PCR2 product of interest.

If there was more than one SNP in the sequence of interest, the
second SNP (SNP2) was found by using an adaptor that was specific for
SNP2 and hybridizing the adaptor to the PCR2 product containing the first
SNP. The Fen-ligase and amplification procedures were repeated for the
PCR2 product containing the first SNP. If the amplified product yielded a
second SNP, then SNP1 and SNP2 were on the same fragment.

If the SNP is unknown, then four allele-specific adaptors (e.g., C, G,
A, and T) can be used to hybridize with the target DNA. The substrates
are then treated with the Fen-ligase protocol, including amplification. The
PCR2 products may be analyzed by PROBE, as described herein, to
determine which adaptors were hybridized to the DNA target and thus
identify the SNPs in the sequence.

A Fen-ligase assay was used to detect two SNPs present in Factor
VII. These SNPs are located 814 base pairs apart from each other. SNP1
was located at position 8401 (C to T), and SNP2 was located at 9215 (G
to A) (SEQ ID #).
A. First Amplification Step 

A PCR product (PCR1) was generated for a known heterozygous
individual at SNP1, a short distance from the 5' end of the SNP.
Specifically, a 10 µL PCR reaction was performed by mixing 1.5 mM
MgCl2, 200 µM of each dNTP, 0.5 U HotStar polymerase, 0.1 µM of a
forward primer having the sequence 5'-GCG CTC CTG TCG GTG CCA
(SEQ ID NO: 56), 0.1 µM of a reverse primer having the sequence 5'-GCC
TGA CTG GTG GGG CCC (SEQ ID NO: 57), and 1 ng of genomic DNA.
The annealing temperature was 58°C, and the amplification process
yielded fragments that were 861 bp in length.

The PCR1 reaction mixture was divided in half and was treated
with an exonuclease I/SAP mixture (0.22 µL mixture/5 µL PCR1 reaction)
which contained 1.0 µL SAP and 0.1 µL exo I. The exonuclease
treatment was done for 30 minutes at 37 °C and then 20 minutes at
85 °C to denature the DNA.

B. Adaptor Oligonucleotides 

A solution of allele-specific adaptors (C and T), each consisting of one
long and one short oligonucleotide, was prepared. The long
arm and short arm oligonucleotides of each adaptor (10 µM) were mixed in
a 1:1 ratio and heated for 30 seconds at 95 °C. The temperature was
reduced in 2°C increments to 37°C for annealing. The C-adaptor had a
short arm sequence of 5'-CAT GCA TGC ACG GTC (SEQ ID NO: 58) and
a long arm sequence of 5'-CAG AGA GTA CCC CTC GAC CGT GCA TGC
ATG (SEQ ID NO: 59). Hence, the long arm of the adaptor was 30 bp
(15 bp gene-specific), and the short arm was 15 bp. The T-adaptor had a
short arm sequence of 5'-CAT GCA TGC ACG GTT (SEQ ID NO: 60) and
a long arm sequence of 5'-GTA CGT ACG TGC CAA CTC CCC ATG AGA
GAC (SEQ ID NO: 61). The adaptor could also have a hairpin structure in
which the short and long arm are separated by a loop of 3 to
10 nucleotides (SEQ ID NO: 118).
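
By way of illustration only, the adaptor design can be checked from the
C-adaptor sequences given above. The following minimal sketch (the helper
name reverse_complement and the variable names are illustrative and are not
part of the described procedure) confirms that the 3' half of the C-adaptor
long arm (SEQ ID NO: 59) is the reverse complement of the C-adaptor short
arm (SEQ ID NO: 58), which is what permits the two arms to anneal.

```python
# Sequences copied from the C-adaptor described above, with spaces removed.
SHORT_ARM_C = "CATGCATGCACGGTC"                 # SEQ ID NO: 58, 15 nt
LONG_ARM_C = "CAGAGAGTACCCCTCGACCGTGCATGCATG"   # SEQ ID NO: 59, 30 nt

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


# The long arm is described as a 5' gene-specific region followed by a
# 3' region complementary to the short arm.
gene_specific = LONG_ARM_C[:15]
short_arm_binding = LONG_ARM_C[15:]

arms_anneal = short_arm_binding == reverse_complement(SHORT_ARM_C)
print(len(LONG_ARM_C), len(SHORT_ARM_C), arms_anneal)
```

The same check can be applied to any other adaptor pair, provided both
sequences are written in the 5' to 3' direction.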

C. FEN-ligase reaction

In two tubes (one tube for each allele-specific adaptor per sample)
was placed a solution (Solution A) containing 3.5 µL 10 mM
16% PEG/50 mM MOPS, 1.2 µL 25 mM MgCl2, 1.5 µL 10X Ampligase
Buffer, and 2.5 µL PCR1. Each tube containing Solution A was incubated
at 95 °C for 5 minutes to denature the PCR1 product. A second solution
(Solution B) containing 1.65 µL Ampligase (Thermostable ligase,
Epicentre Technologies), 1.65 µL 200 ng/µL MFEN (from Methanococcus
jannaschii), and 3.0 µL of an allele-specific adaptor (C or T) was prepared.
Thus, different variations of Solution B, each variation containing a
different allele-specific adaptor, were made. Solution B was added to
Solution A at 95 °C and incubated at 55 °C for 3 hours. The total
reaction volume was 15.0 µL per adaptor-specific reaction. For a bi-allelic
system, 2 x 15.0 µL reactions were required.
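
As a consistency check on the volumes listed above, the components of
Solution A and Solution B can be summed. The short sketch below (the
dictionary keys and variable names are illustrative) indicates that the
listed components add up to the stated 15.0 µL per adaptor-specific
reaction.

```python
import math

# Volumes in microliters, as listed for the FEN-ligase reaction.
solution_a = {
    "PEG/MOPS": 3.5,
    "25 mM MgCl2": 1.2,
    "10X Ampligase Buffer": 1.5,
    "PCR1 product": 2.5,
}
solution_b = {
    "Ampligase": 1.65,
    "MFEN": 1.65,
    "allele-specific adaptor": 3.0,
}

volume_a = sum(solution_a.values())
volume_b = sum(solution_b.values())
total_per_reaction = volume_a + volume_b

# A bi-allelic system needs one reaction per adaptor (C and T).
total_bi_allelic = 2 * total_per_reaction

volumes_match = math.isclose(total_per_reaction, 15.0)
print(round(volume_a, 2), round(volume_b, 2), volumes_match)
```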

The Fen-ligase reaction in each tube was then deactivated by
adding 8.0 µL 10 mM EDTA. Then, 1.0 µL exoIII/Buffer (70%/30%)
solution was added to each sample and incubated 30 minutes at 37 °C,
20 minutes at 70 °C (to deactivate exoIII), and 5 minutes at 95 °C (to
denature the sample and dissociate unused adaptor from template). The
samples were cooled in an ice slurry and purified on UltraClean PCR
Clean-up (MoBio) spin columns, which removed all fragments less than
100 base pairs in length. The fragments were eluted with 50 µL H2O.
D. Second Amplification Step 

A second amplification reaction (PCR2) was conducted in each
sample tube using the short arm adaptor (C or T) sequence as the forward
primer (minus the SNP1 site). Only the ligated homologue was amplified.
A standard PCR reaction was conducted with a total volume of 10.0 µL
containing 1X Buffer (final concentration), 1.5 mM final concentration
MgCl2, 200 µM final concentration dNTPs, 0.5 U HotStar polymerase, 0.1
µM final concentration forward primer 5'-CAT GCA TGC ACG GT (SEQ ID
NO: 62), 0.1 µM final concentration reverse primer 5'-GCC TGA CTG GTG
GGG CCC (SEQ ID NO: 63), and 1.0 µL of the purified FEN-ligase reaction
solution. The annealing temperature was 58°C. The PCR2 product was
analyzed by MALDI-TOF mass spectroscopy as described in Example 4.
The mass spectrum of Fen SNP1 showed a mass of 6084.08 Daltons,
representing the C allele.
E. Genotyping Additional SNPs 

The second SNP (SNP2) can be found by using an adaptor that is
specific for SNP2 and hybridizing that adaptor to the PCR2 product
containing the first SNP. The Fen-ligase and amplification procedures are
repeated for the PCR2 product containing the first SNP. If the amplified
product yields a second SNP, then SNP1 and SNP2 are on the same
fragment. The mass spectrum of SNP2, representing the T allele,
showed a mass of 6359.88 Daltons.

This assay can also be performed upon pooled DNA to yield
haplotype frequencies as described herein. The Fen-ligase assay can be
used to analyze multiplexes as described herein.

EXAMPLE 8 
Nickase-Mediated Sequence Analysis 
A DNA nickase, or DNase, was used to recognize and cleave one strand
of a DNA duplex. The two nickases used were NY2A nickase and NYS1
nickase (Megabase), which cleave DNA at the following sites:
NY2A: 5'...R AG...3'

3'...Y4TC...5' where R = A or G and Y = C or T
NYS1: 5'... I CC[A/G/T]...3'

3'... GG[T/C/A]...5'
A. Nickase Digestion

Tris-HCl (10 mM), KCl (10 mM, pH 8.3), magnesium acetate (25
mM), BSA (1 mg/mL), and 6 U of Cvi NY2A or Cvi NYS1 Nickase
(Megabase Research) were added to 25 pmol of double-stranded
oligonucleotide template having a sequence of 5'-CGC AGG GTT TCC
TCG TCG CAC TGG GCA TGT G-3' (SEQ ID NO: 90, Operon, Alameda,
CA) synthesized using standard phosphoramidite chemistry. With a total
volume of 20 µL, the reaction mixture was incubated at 37 °C for 5 hours,
and the digestion products were purified using ZipTips (Millipore, Bedford,
MA) as described in Example 5. The samples were analyzed by
MALDI-TOF mass spectroscopy as described in Example 1. The nickase Cvi
NY2A yielded three fragments with masses 4049.76 Daltons, 5473.14
Daltons, and 9540.71 Daltons. The Cvi NYS1 nickase yielded fragments
with masses 2063.18 Daltons, 3056.48 Daltons, 6492.81 Daltons, and
7450.14 Daltons.

B. Nickase Digestion of Pooled Samples 

DQA (HLA ClassII-DQ Alpha, expected fragment size = 225 bp) was
amplified from the genomic DNA of 100 healthy individuals. DQA was
amplified using standard PCR chemistry in a reaction having a total
volume of 50 µL containing 10 mM Tris-HCl, 10 mM KCl (pH 8.3), 2.5
mM MgCl2, 200 µM of each dNTP, 10 pmol of a forward primer having
the sequence 5'-GTG CTG CAG GTG TAA ACT TGT ACC AG-3' (SEQ ID
NO: 64), 10 pmol of a reverse primer having the sequence 5'-CAC GGA
TCC GGT AGC AGC GGT AGA GTT G-3' (SEQ ID NO: 65), 1 U DNA
polymerase (Stoffel fragment, Perkin Elmer), and 200 ng human genomic
DNA (2 ng DNA/individual). The template was denatured at 94 °C for 5
minutes. Thermal cycling was continued with a touch-down program that
included 45 cycles of 20 seconds at 94°C, 30 seconds at 56°C, 1
minute at 72°C, and a final extension of 3 minutes at 72°C. The crude
PCR product was used in the subsequent nickase reaction.

The unpurified PCR product was subjected to nickase digestion.
Tris-HCl (10 mM), KCl (10 mM, pH 8.3), magnesium acetate (25 mM),
BSA (1 mg/mL), and 5 U of Cvi NY2A or Cvi NYS1 Nickase (Megabase
Research) were added to 25 pmol of the amplified template with a total
reaction volume of 20 µL. The mixture was then incubated at 37 °C for 5
hours. The digestion products were purified with ZipTips (Millipore,
Bedford, MA) as described in Example 5. The samples were analyzed by
MALDI-TOF mass spectroscopy as described in Example 4. This assay
can also be used to do multiplexing and standardless genotyping as
described herein.

To simplify the nickase mass spectrum, the two complementary
strands can be separated after digestion by using a single-stranded
undigested PCR product as a capture probe. This probe (preparation
shown below in Example 8C) can be hybridized to the nickase fragments
in hybridization buffer containing 200 mM sodium citrate and 1% blocking
reagent (Boehringer Mannheim). The reaction is heated to 95 °C for 5
minutes and cooled to room temperature over 30 minutes by using a
thermal cycler (PTC-200 DNA engine, MJ Research, Waltham, MA). The
capture probe-nickase fragment is immobilized on 140 µg of
streptavidin-coated magnetic beads. The beads are subsequently washed
three times with 70 mM ammonium citrate. The captured single-stranded
nickase fragments are eluted by heating to 80°C for 5 minutes in 5 µL of
50 mM ammonium hydroxide.

C. Preparation of Capture Probe 

The capture probe is prepared by amplifying the human β-globin
gene (3' end of intron 1 to 5' end of exon 2) via PCR methods in a total
volume of 50 µL containing GeneAmp 1X PCR Buffer II, 10 mM
Tris-HCl, pH 8.3, 50 mM KCl, 2 mM MgCl2, 0.2 mM dNTP mix, 10 pmol of
each primer (forward primer 5'-ACTGGGCATGTGGAGACAG-3' (SEQ ID
NO: 66) and biotinylated reverse primer
bio-5'-GCACTTTCTTGCCATGAG-3' (SEQ ID NO: 67)), 2 U of AmpliTaq Gold,
and 200 ng of human genomic DNA. The template is denatured at 94°C
for 8 minutes. Thermal cycling is continued with a touch-down program
that includes 11 cycles of 20 seconds at 94°C, 30 seconds at 64°C, and
1 minute at 72°C, and a final extension of 5 minutes at 72 °C. The
amplicon is purified using the UltraClean PCR clean-up kit (MO Bio
Laboratories, Solana Beach, CA).

EXAMPLE 9 

Multiplex Type IIS SNP Assay 

A Type IIS assay was used to identify human gene sequences with
known SNPs. The Type IIS enzyme used in this assay was Fok I, which
effected double-stranded cleavage of the target DNA. The assay involved
the steps of amplification and Fok I treatment of the amplicon. In the
amplification step, the primers were designed so that each PCR product
of a designated gene target was less than 100 bases and so that a Fok I
recognition sequence was incorporated at the 5' and 3' ends of the
amplicon. Therefore, the fragments that were cleaved by Fok I included a
center fragment containing the SNP of interest.

Ten human gene targets with known SNPs were analyzed by this
assay. Sequences of the ten gene targets, as well as the primers used to
amplify the target regions, are found in Table 5. The ten targets were
lipoprotein lipase, prothrombin, factor V, cholesterol ester transfer protein
(CETP), factor VII, factor XIII, HLA-H exon 2, HLA-H exon 4,
methylenetetrahydrofolate reductase (MTHR), and P53 exon 4 codon 72.

Amplification of the ten human gene sequences was carried out in
a single 50 µL volume PCR reaction with 20 ng of human genomic DNA
template in 5 PCR reaction tubes. Each reaction vial contained 1X PCR
buffer (Qiagen), 200 µM dNTPs, 1 U HotStar Taq polymerase (Qiagen), 4
mM MgCl2, and 10 pmol of each primer. US8, having the sequence
5'-TCAGTCACGACGTT-3' (SEQ ID NO: 68), and US9, having the sequence
5'-CGGATAACAATTTC-3' (SEQ ID NO: 69), were used for the forward and
reverse primers, respectively. Moreover, the primers were designed such
that a Fok I recognition site was incorporated at the 5' and 3' ends of the
amplicon. Thermal cycling was performed in 0.2 mL tubes or a 96-well
plate using an MJ Research Thermal Cycler (calculated temperature) with
the following cycling parameters: 94°C for 5 minutes; 45 cycles of 94°C
for 20 seconds, 56°C for 20 seconds, and 72°C for 60 seconds; and 72°C
for 3 minutes.
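
For planning purposes, the cycling parameters above can be written as a
simple data structure and the programmed hold time estimated. The sketch
below is illustrative only; the variable names are not part of the described
procedure, and ramp times between temperatures, which depend on the
instrument, are not included.

```python
# Each step is (temperature in degrees C, hold time in seconds).
initial_denaturation = (94, 5 * 60)
cycle_steps = [(94, 20), (56, 20), (72, 60)]
cycle_count = 45
final_extension = (72, 3 * 60)

seconds_per_cycle = sum(hold for _, hold in cycle_steps)
total_seconds = (
    initial_denaturation[1]
    + cycle_count * seconds_per_cycle
    + final_extension[1]
)
total_minutes = total_seconds / 60
print(f"{total_minutes:.0f} minutes of programmed hold time")
```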

Following PCR, the sample was treated with 0.2 U Exonuclease I
(Amersham Pharmacia) and shrimp alkaline phosphatase (SAP; Amersham
Pharmacia) to remove the unincorporated primers and dNTPs. Typically,
0.2 U of exonuclease I and SAP were added to 5 µL of the PCR sample.
The sample was then incubated at 37°C for 15 minutes. Exonuclease I
and SAP were then inactivated by heating the sample to 85°C for 15
minutes. Fok I digestion was performed by adding 2 U of Fok I (New
England Biolabs) to the 5 µL PCR sample and incubating at 37 °C for 30
minutes. Since the Fok I restriction sites are located on both sides of the
amplicon, the 5' and 3' cutoff fragments have higher masses than the
center fragment containing the SNP. The sample was then purified by
anion exchange and analyzed by MALDI-TOF mass spectrometry as
described in Example 4. The masses of the gene fragments from this
multiplexing experiment are listed in Table 6. These gene fragments were
resolved in mass spectra, thereby allowing multiplex analysis of sequence
variability in these genes.

Table 5 

Genes for Multiplex Type IIS Assay

Gene


Sequence 


Seq. ID 
No. 


Primers 


Seq. 
ID No. 


Lipoprotein 
Lipase 

(Asn291Ser) 


cctttgagaa agggctctgc ttgagttgta 
gaaagaaccg ctgcaacaat 
ctqqqctatq agatca[a»»q]taa aqtcaqaqcc 
aaaaqaaqca gcaaaatata 


98-99 


5' 

caatttcatcgctggatgcaatct 

gggctatgagatc 3' 

5' 

caatttcacacagcggatgcttct 
tnggctctgact3' 


70 
71 


Prothrombin 


26731 gaanatttttgtgmctaaaactatggt 
tcccaataaa aqtqactctc 
26781 aqc[g*alagcctc aatgctccca 
ptqctanca tqaacagctc tctqggctca 


100- 
101 


5' 

tcaqtcacqacqttqgatqccaa 
taaaaqtqactctcaqc 3' 
5' 

/'nnataaraatftrnnatneaH 

qqqaqcattqaqqc 3' 


72 
73 


Factor V 
(Arg506Gln)


taataggact acttctaatc tgtaagagca 
qatccctqqa caqqciq*»a]aqqa 


102- 

1 


5' 

tcagtcacgacqttggatgaofca 
gatccctqqacaqqc 3' 


74 




atacaqqtat tttqrtccttq aagtaacctt tcaa 




5' 

cggataacaatttcqgatqqaca 
aaatacctqtattcc 3' 


75 


Cholesterol ester 
transfer protein 
(CETP) (I405V)


1261 ctcaccatgg gcamgatt gcaqagcaqe 
tccgaqtccfg *- a] tccaqaqctt 

1311 cctqcaqtca atqatcaccq ctqtgggcat 
ccctgaggtc atgtctcgta 


104- 
1 US 


5' 

tcagtcacgacqttggatgcaga 
qcaqctccqaqtc 3' 

5' 

caqcqqtqatcattqqatqcaqq 


76 
77 








aacjctctgg 3' 




Factor VII 


1221 agcaaggact cctgcaaggg ggacagtgga 


106- 
1 07 


5' 

icoBicautjacQicqciaipccwa 


78 




1271 ccla*>g]gggcacg tqqtacctqa 
cqqqcatcqt caqctgqaoc caaaqctocq 




catqccacccactac 3' 

5' 

cggataacaatttcqqatqcccg 
tcaqqtaccacq 3' 


79 


Factor XIII 
(V34L) 


1 1 1 caataactct aatgcagcgg aagatgacct 
qcccacaqtq qaqcttcaqq 


108- 
109 


5' 


80 




161 gclg»t)tggtgcc ccggggcgtc 
aacctacaaq qtatqaqcat accccccttc 




caqtqqaqcttcaq 3' 
5' 

qctcataccttqcaqqatqacq 
3* 


81 


HLA-H exon 2 

(His63Asp)


361 ttgaagcTttqggctacatg qatqaccaqc 
tgttcqtqttctatgat[Cft>q]at 


110- 
111 


5' 

rcagtcacgacgttgqatqacca 


82 




411 qaqaqtcqcc qtqtqqaqcc ccqaactcca 
tflggrttcca gtagaamc 




qctqttcqtqttc3' 

5' 

tacatgqaqttcqqqqatqcaca 
cgqcgactctc 3' 


83 


HLA-H exon 4 
(Cys282Tyr)


1 02 1 ggataacctt ggctgtaccc cctqqqqaaq 
aqcaqaqata tacgt[g»»a]ccag 


112- 
113 


5' 

tcagtcacgacgttggatgggqa 


84 








aqaqcaqaqatatacqt 3' 






1071 qtqgaqcacc caqqcctqqa tcaqcccctc 
attgtgatct gggagccctc 




5' 

qaqqqqctqatccaqqatqqqt 


85 








qctccac 3' 

Methylenetetrahy
drofolate reductas


761 tqaaqcactt qaaqqa qaaq qtqtctqcqq 
qagfeMlcqattt catcatcacq 


1 14- 
115 


5* 

tcaatcacaacqttqqatqqqqa 


86 




e (MTHR) 
(Ala222Val) 


811 cagcttttct ttgaqgctga cacattcttc 




aqaqcaqaqatatacqt 3' 
5' 

qaqqqqctqatccaqqatqqqt 
qctccac 3' 


87 




P53 Exon 4
Codon 72
(Arg72Pro)


12101 tccaqatqaa qctcccaqaa 
tqccagaqqc tgctcccc[g»c]c gtggcccctg 

12151 caccaqcaqc tcctacaccq 
qcqqcccctQ 


116- 
117 


5' 

qatqaaqctcccaqqatqccaq 
agqc 3' 

5' 

qccqccqqtqtaqqatqctqctq 
gtgc 3' 


88 
89 
Table 6

The Mass of Center Fragments for Ten Different SNP Typing by
IIS Assay
EXAMPLE 10

Exemplary use of parental medical history parameter for stratification of healthy
database

A healthy database can be used to associate a disease state with a
specific allele (SNP) that has been found to show a strong association between
age and the allele, in particular the homozygous genotype. The method involves
using the same healthy database used to identify the age-dependent association;
however, stratification is by information given by the donors about common
disorders from which their parents suffered (the donor's familial history of
disease). There are three possible answers a donor could give about the health
status of their parents: neither was affected, one was affected, or both were
affected. Only donors above a certain minimum age, depending on the disease,
are utilized, as the donors' parents must be old enough to have exhibited
clinical disease phenotypes. The genotype frequency in each of these groups is
determined and compared with each other. If there is an association of the
marker in the donor to a disease, the frequency of the heterozygous genotype will
be increased. The frequency of the homozygous genotype should not increase,
as it should be significantly underrepresented in the healthy population.
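
The stratification described above can be sketched as a simple tabulation
of genotype frequencies per parental-history group. The example below is a
minimal illustration only: the donor records, genotype labels, and the
function name genotype_frequencies are hypothetical, and no statistical
test of the differences between groups is included.

```python
from collections import Counter

# Hypothetical donor records: (parental history stratum, genotype).
# Strata follow the three answers described above; the genotypes and
# counts are invented purely for illustration.
donors = [
    ("neither", "AA"), ("neither", "AA"), ("neither", "AB"), ("neither", "AA"),
    ("one", "AB"), ("one", "AA"), ("one", "AB"), ("one", "AA"),
    ("both", "AB"), ("both", "AB"), ("both", "AA"), ("both", "AB"),
]


def genotype_frequencies(records):
    """Return {stratum: {genotype: frequency}} for the given records."""
    counts = {}
    for stratum, genotype in records:
        counts.setdefault(stratum, Counter())[genotype] += 1
    return {
        stratum: {g: n / sum(c.values()) for g, n in c.items()}
        for stratum, c in counts.items()
    }


frequencies = genotype_frequencies(donors)
for stratum in ("neither", "one", "both"):
    # Frequency of the heterozygous genotype in each stratum.
    print(stratum, frequencies[stratum].get("AB", 0.0))
```

In this invented data set the heterozygous frequency rises across the
three groups, which is the pattern the text associates with a marker
linked to disease.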
EXAMPLE 11

Method and Device for Identifying a Biological Sample
Description

In accordance with the present invention, a method and device for
identifying a biological sample is provided. Referring now to FIG. 24, an
apparatus 10 for identifying a biological sample is disclosed. The apparatus 10
for identifying a biological sample generally comprises a mass spectrometer 15
communicating with a computing device 20. In a preferred embodiment, the
mass spectrometer may be a MALDI-TOF mass spectrometer manufactured by
Bruker-Franzen Analytik GmbH; however, it will be appreciated that other mass
spectrometers can be substituted. The computing device 20 is preferably a
general purpose computing device. However, it will be appreciated that the
computing device could be alternatively configured; for example, it may be
integrated with the mass spectrometer or could be part of a computer in a larger
network system.

The apparatus 10 for identifying a biological sample may operate as an
automated identification system having a robot 25 with a robotic arm 27
configured to deliver a sample plate 29 into a receiving area 31 of the mass
spectrometer 15. In such a manner, the sample to be identified may be placed
on the plate 29 and automatically received into the mass spectrometer 15. The
biological sample is then processed in the mass spectrometer to generate data
indicative of the mass of DNA fragments in the biological sample. This data may
be sent directly to computing device 20, or may have some preprocessing or
filtering performed within the mass spectrometer. In a preferred embodiment,
the mass spectrometer 15 transmits unprocessed and unfiltered mass
spectrometry data to the computing device 20. However, it will be appreciated
that the analysis in the computing device may be adjusted to accommodate
preprocessing or filtering performed within the mass spectrometer.

Referring now to FIG. 25, a general method 35 for identifying a biological
sample is shown. In method 35, data is received into a computing device from a
test instrument in block 40. Preferably the data is received in a raw,
unprocessed and unfiltered form, but alternatively may have some form of
filtering or processing applied. The test instrument of a preferred embodiment is
a mass spectrometer as described above. However, it will be appreciated that
other test instruments could be substituted for the mass spectrometer.

The data generated by the test instrument, and in particular the mass
spectrometer, includes information indicative of the identification of the
biological sample. More specifically, the data is indicative of the DNA
composition of the biological sample. Typically, mass spectrometry data
gathered from DNA samples obtained from DNA amplification techniques are
noisier than, for example, those from typical protein samples. This is due in part
to the fact that protein samples are more readily prepared in greater abundance
and are more easily ionizable than DNA samples.
Accordingly, conventional mass spectrometer data analysis techniques are
generally ineffective for DNA analysis of a biological sample. To improve the
analysis capability so that DNA composition data can be more readily discerned,
a preferred embodiment uses wavelet technology for analyzing the DNA mass
spectrometry data. Wavelets are an analytical tool for signal processing,
numerical analysis, and mathematical modeling. Wavelet technology provides a
basic expansion function which is applied to a data set. Using wavelet
decomposition, the data set can be simultaneously analyzed in the time and
frequency domains. Wavelet transformation is the technique of choice in the
analysis of data that exhibit complicated time (mass) and frequency domain
information, such as MALDI-TOF DNA data. Wavelet transforms as described
herein have superior denoising properties as compared to conventional Fourier
analysis techniques. Wavelet transformation has proven to be particularly
effective in interpreting the inherently noisy MALDI-TOF spectra of DNA
samples. In using wavelets, a "small wave" or "scaling function" is used to
transform a data set into stages, with each stage representing a frequency
component in the data set. Using wavelet transformation, mass spectrometry
data can be processed, filtered, and analyzed with sufficient discrimination to be
useful for identification of the DNA composition for a biological sample.

Referring again to FIG. 25, the data received in block 40 is denoised in
block 45. The denoised data then has a baseline correction applied in block 50.
A baseline correction is generally necessary as data coming from the test
instrument, in particular a mass spectrometer instrument, has data arranged in a
generally exponentially decaying manner. This generally exponentially decaying
arrangement is not due to the composition of the biological sample, but is a
result of the physical properties and characteristics of the test instrument, and
other chemicals involved in DNA sample preparation. Accordingly, baseline
correction substantially corrects the data to remove a component of the data
attributable to the test system, and sample preparation characteristics.

After denoising in block 45 and the baseline correction in block 50, a
signal remains which is generally indicative of the composition of the biological
sample. However, due to the extraordinary discrimination required for analyzing
the DNA composition of the biological sample, the composition is not readily
apparent from the denoised and corrected signal. For example, although the
signal may include peak areas, it is not yet clear whether these "putative" peaks
actually represent a DNA composition, or whether the putative peaks are the
result of a systemic or chemical aberration. Further, any call of the composition
of the biological sample would have a probability of error which would be
unacceptable for clinical or therapeutic purposes. In such critical situations,
there needs to be a high degree of certainty that any call or identification of the
sample is accurate. Therefore, additional data processing and interpretation is
necessary before the sample can be accurately and confidently identified.

Since the quantity of data resulting from each mass spectrometry test is
typically thousands of data points, and an automated system may be set to
perform hundreds or even thousands of tests per hour, the quantity of mass
spectrometry data generated is enormous. To facilitate efficient transmission
and storage of the mass spectrometry data, block 55 shows that the denoised
and baseline corrected data is compressed.

In a preferred embodiment, the biological sample is selected and
processed to have only a limited range of possible compositions. Accordingly, it
is known where peaks indicating composition should be located, if
present. Taking advantage of knowing the location of these expected peaks, in
block 60 the method 35 matches putative peaks in the processed signal to the
location of the expected peaks. In such a manner, the probability of each
putative peak in the data being an actual peak indicative of the composition of
the biological sample can be determined. Once the probability of each peak is
determined in block 60, then in block 65 the method 35 statistically determines
the composition of the biological sample, and determines if confidence is high
enough to call a genotype.
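
The positional part of the matching in block 60 can be illustrated with a
short sketch. The example below is a simplification under stated
assumptions: the function name match_peaks, the fixed mass tolerance, and
all numerical values are hypothetical, and the sketch only pairs expected
peak locations with nearby putative peaks. It does not compute the peak
probabilities or the statistical genotype call described above.

```python
def match_peaks(putative_masses, expected_masses, tolerance):
    """Pair each expected mass with the closest putative peak, if any lies
    within `tolerance` (same mass units as the inputs)."""
    matches = {}
    for expected in expected_masses:
        closest = min(
            putative_masses,
            key=lambda mass: abs(mass - expected),
            default=None,
        )
        if closest is not None and abs(closest - expected) <= tolerance:
            matches[expected] = closest
        else:
            matches[expected] = None  # no putative peak near this location
    return matches


# Hypothetical values for illustration only.
expected = [5000.0, 6000.0]
putative = [4999.2, 5500.0, 7001.0]
result = match_peaks(putative, expected, tolerance=2.0)
print(result)
```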

Referring again to block 40, data is received from the test instrument,
which is preferably a mass spectrometer. In a specific illustration, FIG. 26
shows an example of data from a mass spectrometer. The mass spectrometer
data 70 generally comprises data points distributed along an x-axis 71 and a
y-axis 72. The x-axis 71 represents the mass of particles detected, while the
y-axis 72 represents a numerical concentration of the particles. As can be seen in
FIG. 26, the mass spectrometry data 70 is generally exponentially decaying with
data at the left end of the x-axis 73 generally decaying in an exponential manner
toward data at the heavier end 74 of the x-axis 71. However, the general
exponential presentation of the data is not indicative of the composition of the
biological sample, but is more reflective of systematic error and characteristics.
Further, as described above and illustrated in FIG. 26, considerable noise exists
in the mass spectrometry DNA data 70.

Referring again to block 45, where the raw data received in block 40 is
denoised, the denoising process will be described in more detail. As illustrated
in FIG. 25, the denoising process generally entails 1) performing a wavelet
transformation on the raw data to decompose the raw data into wavelet stage
coefficients; 2) generating a noise profile from the highest stage of wavelet
coefficients; and 3) applying a scaled noise profile to other stages in the wavelet
transformation. Each step of the denoising process is further described below.

Referring now to FIG. 27, the wavelet transformation of the raw mass
spectrometry data is generally diagramed. Using wavelet transformation
techniques, the mass spectrometry data 70 is sequentially transformed into
stages. In each stage the data is represented in a high stage and a low stage,
with the low stage acting as the input to the next sequential stage. For
example, the mass spectrometry data 70 is transformed into stage 0 high data
82 and stage 0 low data 83. The stage 0 low data 83 is then used as an input
to the next level transformation to generate stage 1 high data 84 and stage 1
low data 85. In a similar manner, the stage 1 low data 85 is used as an input to
be transformed into stage 2 high data 86 and stage 2 low data 87. The
transformation is continued until no more useful information can be derived by
further wavelet transformation. For example, in the preferred embodiment a
24-point wavelet is used. More particularly a wavelet commonly referred to as the
Daubechies 24 is used to decompose the raw data. However, it will be
appreciated that other wavelets can be used for the wavelet transformation.
Since each stage in a wavelet transformation has one-half the data points of the
previous stage, the wavelet transformation can be continued until the stage n
low data 89 has around 50 points. Accordingly, the stage n high 88 would
contain about 100 data points. Since the preferred wavelet is 24 points long,
little data or information can be derived by continuing the wavelet transformation
on a data set of around 50 points.
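
The cascade of high and low stages can be illustrated with a minimal
sketch. For brevity the example below substitutes the 2-point Haar wavelet
for the 24-point Daubechies wavelet of the preferred embodiment, so it
shows only the staging structure (each low output feeding the next stage,
with the point count halving) and not the filtering behavior of the
preferred wavelet. The function names, the stopping parameter min_points,
and the synthetic input are assumptions made for illustration.

```python
import math


def haar_step(data):
    """Split data into (low, high) halves with the 2-point Haar wavelet.
    A trailing unpaired point, if any, is dropped."""
    scale = math.sqrt(2.0)
    low = [(data[i] + data[i + 1]) / scale for i in range(0, len(data) - 1, 2)]
    high = [(data[i] - data[i + 1]) / scale for i in range(0, len(data) - 1, 2)]
    return low, high


def decompose(data, min_points=50):
    """Cascade haar_step on the low output until it has about min_points."""
    stages = []  # stages[k] holds the stage-k high data
    low = list(data)
    while len(low) // 2 >= min_points:
        low, high = haar_step(low)
        stages.append(high)
    return stages, low


# Synthetic, exponentially decaying input standing in for a 3200-point spectrum.
spectrum = [math.exp(-i / 800.0) for i in range(3200)]
high_stages, final_low = decompose(spectrum)
print([len(h) for h in high_stages], len(final_low))
```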

FIG. 28 shows an example of stage 0 high data 95. Since stage 0 high
data 95 is generally indicative of the highest frequencies in the mass
spectrometry data, stage 0 high data 95 will closely relate to the quantity of
high frequency noise in the mass spectrometry data. In FIG. 29, an exponential
fitting formula has been applied to the stage 0 high data 95 to generate a stage
0 noise profile 97. In particular, the exponential fitting formula is in the format
$A_0 + A_1 \exp(-A_2 m)$. It will be appreciated that other exponential fitting
formulas or other types of curve fits may be used.

Referring now to FIG. 30, noise profiles for the other high stages are
determined. Since the later data points in each stage will likely be representative
of the level of noise in each stage, only the later data points in each stage are
used to generate a standard deviation figure that is representative of the noise
content in that particular stage. More particularly, in generating the noise profile
for each remaining stage, only the last five percent of the data points in each
stage are analyzed to determine a standard deviation number. It will be
appreciated that other numbers of points, or alternative methods, could be used
to generate such a standard deviation figure.

The standard deviation number for each stage is used with the stage 0
noise profile (the exponential curve) 97 to generate a scaled noise profile for
each stage. For example, FIG. 30 shows that stage 1 high 98 has stage 1
high data 103 with the last five percent of the data points represented by area
99. The points in area 99 are evaluated to determine a standard deviation
number indicative of the noise content in stage 1 high data 103. The standard
deviation number is then used with the stage 0 noise profile 97 to generate a
stage 1 noise profile.

In a similar manner, stage 2 high 100 has stage 2 high data 104 with the
last five percent of points represented by area 101. The data points in area 101
are then used to calculate a standard deviation number which is then used to
scale the stage 0 noise profile 97 to generate a noise profile for stage 2 data.
This same process is continued for each of the stage high data as shown by the
stage n high 105. For stage n high 105, stage n high data 108 has the last five
percent of data points indicated in area 106. The data points in area 106 are
used to determine a standard deviation number for stage n. The stage n
standard deviation number is then used with the stage 0 noise profile 97 to
generate a noise profile for stage n. Accordingly, each of the high data stages
has a noise profile.
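
The per-stage noise estimate can be sketched as below. The tail fraction of
five percent comes from the description; everything else is an assumption
made for illustration, in particular the use of a population standard
deviation and the scaling rule in scale_profile, since the text does not
state exactly how the standard deviation number is combined with the stage 0
curve. Resampling the curve to the shorter length of later stages is left
out.

```python
import statistics

def tail_noise_std(stage_high, fraction=0.05):
    """Standard deviation of the last `fraction` of a stage's high data."""
    count = max(2, int(len(stage_high) * fraction))  # keep at least 2 points
    return statistics.pstdev(stage_high[-count:])

def scale_profile(stage0_profile, stage_std, stage0_std):
    """Assumed rule: scale the stage 0 curve by the ratio of the stage's
    tail deviation to the stage 0 tail deviation."""
    factor = stage_std / stage0_std
    return [factor * p for p in stage0_profile]
```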

FIG. 31 shows how the noise profile is applied to the data in each stage.
Generally, the noise profile is used to generate a threshold which is applied to
the data in each stage. Since the noise profile is already scaled to adjust for the
noise content of each stage, calculating a threshold permits further adjustment
to tune the quantity of noise removed. Wavelet coefficients below the threshold
are ignored while those above the threshold are retained. Accordingly, the
remaining data has a substantial portion of the noise content removed.

Due to the characteristics of wavelet transformation, the lower stages,
such as stage 0 and 1, will have more noise content than the later stages such
as stage 2 or stage n. Indeed, stage n low data is likely to have little noise at
all. Therefore, in a preferred embodiment the noise profiles are applied more
aggressively in the lower stages and less aggressively in the later stages. For
example, FIG. 31 shows that stage 0 high threshold is determined by multiplying
the stage 0 noise profile by a factor of four. In such a manner, significant
numbers of data points in stage 0 high data 95 will be below the threshold and
therefore eliminated. Stage 1 high threshold 112 is set at two times the noise
profile for the stage 1 high data, and stage 2 high threshold 114 is set equal to
the noise profile for stage 2 high. Following this geometric progression, stage n
high threshold 116 is therefore determined by scaling the noise profile for each
respective stage n high by a factor equal to (1/2^(n-2)). It will be appreciated that
other factors may be applied to scale the noise profile for each stage. For
example, the noise profile may be scaled more or less aggressively to
accommodate specific systemic characteristics or sample compositions. As
indicated above, stage n low data does not have a noise profile applied as stage
n low data 118 is assumed to have little or no noise content. After the scaled
noise profiles have been applied to each high data stage, the mass spectrometry
data 70 has been denoised and is ready for further processing. A wavelet
transformation of the denoised signal results in the sparse data set 120 as
shown in FIG. 31.
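
The stage-dependent thresholds can be illustrated with a short sketch. The
factors 4, 2 and 1 for stages 0, 1 and 2 come from the description, and the
general factor is written here as 2 raised to the power (2 - n), which
reproduces that progression. Comparing the magnitude of each coefficient
against the threshold, and setting rejected coefficients to zero, are
assumptions; the function names are hypothetical.

```python
def threshold_factor(stage):
    """Factor applied to a stage's noise profile: 4, 2, 1, 1/2, ..."""
    return 2.0 ** (2 - stage)

def apply_threshold(stage_high, noise_profile, stage):
    """Zero every coefficient whose magnitude is below the threshold."""
    factor = threshold_factor(stage)
    return [c if abs(c) >= factor * n else 0.0
            for c, n in zip(stage_high, noise_profile)]
```

Because the factor halves with each stage, the same coefficient value can be
discarded in stage 0 and kept in stage 1.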

Referring again to FIG. 25, the mass spectrometry data received in block
40 has been denoised in block 45 and is now passed to block 50 for baseline
correction. Before performing baseline correction, the artifacts introduced by the
wavelet transformation procedure are preferably removed. Wavelet
transformation results vary slightly depending upon which point of the wavelet is
used as a starting point. For example, the preferred embodiment uses the
24-point Daubechies-24 wavelet. By starting the transformation at the 0 point of
the wavelet, a slightly different result will be obtained than if starting at points 1
or 2 of the wavelet. Therefore, the denoised data is transformed using every
available starting point, with the results averaged to determine a final
denoised and shifted signal. For example, FIG. 33 shows that the wavelet
coefficient is applied 24 different times and then the results averaged to
generate the final data set. It will be appreciated that other techniques may be
used to accommodate the slight error introduced due to wavelet shifting.

The formula 125 is generally indicated in FIG. 33. Once the signal has
been denoised and shifted, a denoised and shifted signal 130 is generated as
shown in FIG. 58. FIG. 34 shows an example of the wavelet coefficient 135
data set from the denoised and shifted signal 130.

FIG. 36 shows that putative peak areas 145, 147, and 149 are located in
the denoised and shifted signal 150. The putative peak areas are systematically
identified by taking a moving average along the signal 150 and identifying
sections of the signal 150 which exceed a threshold related to the moving
average. It will be appreciated that other methods can be used to identify
putative peak areas in the signal 150.
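
A minimal sketch of this moving-average search follows. The description does
not state the window length or how the threshold relates to the moving
average, so the window of 5 points and the multiplier of 2 are placeholder
assumptions, as are the function names.

```python
def moving_average(signal, window):
    """Centered moving average; the window shrinks near the edges."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out

def putative_peak_areas(signal, window=5, factor=2.0):
    """Return (start, end) index pairs where signal > factor * average."""
    avg = moving_average(signal, window)
    areas, start = [], None
    for i, (value, mean) in enumerate(zip(signal, avg)):
        above = value > factor * mean
        if above and start is None:
            start = i                      # a putative peak area begins
        elif not above and start is not None:
            areas.append((start, i - 1))   # the area ended at the prior point
            start = None
    if start is not None:
        areas.append((start, len(signal) - 1))
    return areas
```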

Putative peak areas 145, 147 and 149 are removed from the signal 150
to create a peak-free signal 155 as shown in FIG. 37. The peak-free signal 155
is further analyzed to identify remaining minimum values 157, and the remaining
minimum values 157 are connected to generate the peak-free signal 155.

FIG. 38 shows a process of using the peak-free signal 155 to generate a
baseline 170 as shown in FIG. 39. As shown in block 162, a wavelet
transformation is performed on the peak-free signal 155. All the stages from the
wavelet transformation are eliminated in block 164 except for the n low stage.
The n low stage will generally indicate the lowest frequency component of the
peak-free signal 155 and therefore will generally indicate the system exponential
characteristics. Block 166 shows that a signal is reconstructed from the n low
coefficients and the baseline signal 170 is generated in block 168.

FIG. 39 shows a denoised and shifted data signal 172 positioned adjacent
a correction baseline 170. The correction baseline 170 is subtracted from the
denoised and shifted signal 172 to generate a signal 175 having a baseline
correction applied as shown in FIG. 40. Although such a denoised, shifted, and
corrected signal is sufficient for most identification purposes, the putative peaks
in signal 175 are not identifiable with sufficient accuracy or confidence to call
the DNA composition of a biological sample.

Referring again to FIG. 25, the data from the baseline correction 50 is
now compressed in block 55. The compression technique used in a preferred
embodiment is detailed in FIG. 41. In FIG. 41 the baseline corrected
data is presented in an array format 182 with x-axis points 183 having an
associated data value 184. The x-axis is indexed by the non-zero wavelet
coefficients, and the associated value is the value of the wavelet coefficient. In
the illustrated data example in table 182, the maximum value 184 is indicated to
be 1000. Although a particularly advantageous compression technique for mass
spectrometry data is shown, it will be appreciated that other compression
techniques can be used. Although not preferred, the data may also be stored
without compression.

In compressing the data according to a preferred embodiment, an
intermediate format 186 is generated. The intermediate format 186 generally
comprises a real number having a whole number portion 188 and a decimal
portion 190. The whole number portion is the x-axis point 183 while the
decimal portion is the value data 184 divided by the maximum data value. For
example, in the data 182 a data value "25" is indicated at x-axis point "100".
The intermediate value for this data point would be "100.025".

From the intermediate compressed data 186 the final compressed data
195 is generated. The first point of the intermediate data file becomes the
starting point for the compressed data. Thereafter each data point in the
compressed data 195 is calculated as follows: the whole number portion (left of
the decimal) is replaced by the difference between the current and the last whole
number. The remainder (right of the decimal) remains intact. For example, the
starting point of the compressed data 195 is shown to be the same as the
intermediate data point which is "100.025". The comparison between the first
intermediate data point "100.025" and the second intermediate data point
"150.220" is "50.220". Therefore, "50.220" becomes the second point of the
compressed data 195. In a similar manner, the second intermediate point is
"150.220" and the third intermediate data point is "500.0001". Therefore, the
third compressed data becomes "350.000". The calculation for determining
compressed data points is continued until the entire array of data points is
converted to a single array of real numbers.
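
The two compression steps can be sketched as follows, using the first two
data points of the example above. The function names are hypothetical, and
the sketch assumes that every data value is smaller than the maximum used for
scaling, so that the decimal portion stays below one.

```python
def to_intermediate(points, max_value):
    """Pack each (x, value) pair into one real number: x + value/max_value."""
    return [x + value / max_value for x, value in points]

def compress(intermediate):
    """Delta-encode the whole-number part; keep the decimal part intact."""
    out = [intermediate[0]]                 # first point is kept unchanged
    for prev, cur in zip(intermediate, intermediate[1:]):
        prev_whole = int(prev)
        cur_whole = int(cur)
        frac = cur - cur_whole              # decimal portion (scaled value)
        out.append((cur_whole - prev_whole) + frac)
    return out

# x-axis points 100 and 150 with values 25 and 220, maximum value 1000
packed = to_intermediate([(100, 25), (150, 220)], 1000)
print([round(v, 3) for v in compress(packed)])
```

Storing differences rather than absolute x-axis positions keeps the whole
number portions small when the retained coefficients are separated by gaps.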

FIG. 42 generally describes the method of compressing mass
spectrometry data, showing that the data file in block 201 is presented as an
array of coefficients in block 202. The data starting point and maximum are
determined as shown in block 203, and the intermediate real numbers are
calculated in block 204 as described above. With the intermediate data points
generated, the compressed data is generated in block 205. The described
compression method is highly advantageous and efficient for compressing data
sets such as a processed data set from a mass spectrometry instrument. The
method is particularly useful for data, such as mass spectrometry data, that uses
large numbers and has been processed to have occasional lengthy gaps in x-axis
data. Accordingly, an x-y data array for processed mass spectrometry data may
be stored with an effective compression rate of 10x or more. Although the
compression technique is applied to mass spectrometry data, it will be
appreciated that the method may also advantageously be applied to other data
sets.

Referring again to FIG. 25, peak heights are now determined in block 60.
The first step in determining peak height is illustrated in FIG. 43 where the signal
210 is shifted left or right to correspond with the position of expected peaks.
As the set of possible compositions in the biological sample is known before the
mass spectrometry data is generated, the possible positioning of expected peaks
is already known. These possible peaks are referred to as expected peaks, such
as expected peaks 212, 214, and 216. Due to calibration or other errors in the
test instrument data, the entire signal may be shifted left or right from its actual
position. Therefore, putative peaks located in the signal, such as putative peaks
218, 222, and 224, may be compared to the expected peaks 212, 214, and 216,
respectively. The entire signal is then shifted such that the putative peaks align
more closely with the expected peaks.

Once the putative peaks have been shifted to match expected peaks, the
strongest putative peak is identified in FIG. 44. In a preferred embodiment, the
strongest peak is calculated as a combination of analyzing the overall peak
height and area beneath the peak. For example, a moderately high but wide
peak would be stronger than a very high peak that is extremely narrow. With
the strongest putative peak identified, such as putative peak 225, a Gaussian
228 curve is fit to the peak 225. Once the Gaussian is fit, the width (W) of the
Gaussian is determined and will be used as the peak width for future
calculations.

As generally addressed above, the denoised, shifted, and baseline-corrected
signal is not sufficiently processed for confidently calling the DNA
composition of the biological sample. For example, although the baseline has
generally been removed, there are still residual baseline effects present. These
residual baseline effects are therefore removed to increase the accuracy and
confidence in making identifications.

To remove the residual baseline effects, FIG. 45 shows that the putative
peaks 218, 222, and 224 are removed from the baseline corrected signal. The
peaks are removed by identifying a center line 230, 232, and 234 of the
putative peaks 218, 222, and 224, respectively, and removing an area to the left
and to the right of the identified center line. For each putative peak, an area
equal to twice the width (W) of the Gaussian is removed from the left of the
center line, while an area equivalent to 50 daltons is removed from the right of
the center line. It has been found that the area representing 50 daltons is
adequate to sufficiently remove the effect of salt adducts which may be
associated with an actual peak. Such adducts appear to the right of an actual
peak and are a natural effect from the chemistry involved in acquiring a mass
spectrum. Although a 50 dalton buffer has been selected, it will be appreciated
that other ranges or methods can be used to reduce or eliminate adduct effects.

The peaks are removed and remaining minima 247 located as shown in
FIG. 46 with the minima 247 connected to create signal 245. A quartic
polynomial is applied to signal 245 to generate a residual baseline 250 as shown
in FIG. 47. The residual baseline 250 is subtracted from the signal 225 to
generate the final signal 255 as indicated in FIG. 48. Although the residual
baseline is the result of a quartic fit to signal 245, it will be appreciated that
other techniques can be used to smooth or fit the residual baseline.
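
The peak-removal window used before the residual baseline fit can be sketched
as below. The sketch covers only the removal step: locating the minima and
the quartic fit are omitted. It assumes that the Gaussian width W is
expressed in the same mass units as the x-axis, and the function and
parameter names are hypothetical.

```python
def remove_peak_windows(masses, intensities, centers, width, adduct_da=50.0):
    """Drop points from (center - 2*width) to (center + adduct_da).

    width is the Gaussian width W; adduct_da is the buffer to the right
    of each center line that covers salt adducts.
    """
    kept = []
    for m, y in zip(masses, intensities):
        in_window = any(c - 2.0 * width <= m <= c + adduct_da
                        for c in centers)
        if not in_window:
            kept.append((m, y))
    return kept
```

The window is deliberately asymmetric, since adducts appear only on the
high-mass side of a peak.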

To determine peak height, as shown in FIG. 49, a Gaussian such as
Gaussian 266, 268, and 270 is fit to each of the peaks, such as peaks 260,
262, and 264, respectively. Accordingly, the height of the Gaussian is
determined as height 272, 274, and 276. Once the height of each Gaussian
peak is determined, then the method of identifying a biological compound 35 can
move into the genotyping phase 65 as shown in FIG. 25.

An indication of the confidence that each putative peak is an actual peak
can be discerned by calculating a signal-to-noise ratio for each putative peak.
Accordingly, putative peaks with a strong signal-to-noise ratio are generally more
likely to be an actual peak than a putative peak with a lower signal-to-noise
ratio. As described above and shown in FIG. 50, the height of each peak, such
as height 272, 274, and 276, is determined for each peak, with the height being
an indicator of signal strength for each peak. The noise profile, such as noise
profile 97, is extrapolated into noise profile 280 across the identified peaks. At
the center line of each of the peaks, a noise value is determined, such as noise
value 282, 283, and 284. With the signal values and noise values generated,
signal-to-noise ratios can be calculated for each peak. For example, the
signal-to-noise ratio for the first peak in FIG. 50 would be calculated as signal
value 272 divided by noise value 282, and in a similar manner the signal-to-noise
ratio of the middle peak in FIG. 50 would be determined as signal 274 divided by
noise value 283.

Although the signal-to-noise ratio is generally a useful indicator of the
presence of an actual peak, further processing has been found to increase the
confidence by which a sample can be identified. For example, the
signal-to-noise ratio for each peak in the preferred embodiment is preferably
adjusted by the goodness of fit between a Gaussian and each putative peak. It is a
characteristic of a mass spectrometer that sample material is detected in a
manner that generally complies with a normal distribution. Accordingly, greater
confidence will be associated with a putative signal having a Gaussian shape
than a signal that has a less normal distribution. The error resulting from having
a non-Gaussian shape can be referred to as a "residual error".

Referring to FIG. 51, a residual error is calculated by taking a root mean
square calculation between the Gaussian 293 and the putative peak 290 in the
data signal. The calculation is performed on data within one width on either side
of a center line of the Gaussian. The residual error is calculated as
SQRT(SUM((G - R)^2) / N), where G is the Gaussian signal value, R is the
putative peak value, and N is the number of points from -W to +W. The
calculated residual error is used to generate an adjusted signal-to-noise ratio,
as described below.

An adjusted signal-to-noise ratio is calculated for each putative peak using
the formula (S/N) * EXP(-0.1 * R), where S/N is the signal-to-noise ratio, and R is
the residual error determined above. Although the preferred embodiment
calculates an adjusted signal-to-noise ratio using a residual error for each peak, it
will be appreciated that other techniques can be used to account for the
goodness of fit between the Gaussian and the actual signal.
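
The residual error and the adjusted signal-to-noise ratio can be sketched
together. The root mean square form follows the description. The decay
constant of 0.1 in the exponential is an assumption, read from a formula that
is only partly legible in the source, so it is exposed as a parameter; the
function names are hypothetical.

```python
import math

def residual_error(gaussian, peak):
    """Root mean square difference between the fitted Gaussian values
    and the putative peak values over the points from -W to +W."""
    n = len(gaussian)
    return math.sqrt(sum((g - r) ** 2 for g, r in zip(gaussian, peak)) / n)

def adjusted_snr(signal, noise, residual, decay=0.1):
    """Signal-to-noise ratio damped by the goodness-of-fit residual.

    decay=0.1 is an assumed constant, not a confirmed value.
    """
    return (signal / noise) * math.exp(-decay * residual)
```

A perfectly Gaussian peak has a residual of zero and keeps its full
signal-to-noise ratio, while a poorly shaped peak is discounted.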

Referring now to FIG. 52, a probability is determined that a putative peak
is an actual peak. In making the determination of peak probability, a probability
profile 300 is generated where the adjusted signal-to-noise ratio is the x-axis and
the probability is the y-axis. Probability is necessarily in the range between a
0% probability and a 100% probability, which is indicated as 1. Generally, the
higher the adjusted signal-to-noise ratio, the greater the confidence that a
putative peak is an actual peak.

At some target value for the adjusted signal-to-noise, it has been found
that the probability is 100% that the putative peak is an actual peak and can
confidently be used to identify the DNA composition of a biological sample.
However, the target value of adjusted signal-to-noise ratio where the probability
is assumed to be 100% is a variable parameter which is to be set according to
application specific criteria. For example, the target signal-to-noise ratio will be
adjusted depending upon trial experience, sample characteristics, and the
acceptable error tolerance in the overall system. More specifically, for situations
requiring a conservative approach where error cannot be tolerated, the target
adjusted signal-to-noise ratio can be set to, for example, 10 and higher.
Accordingly, 100% probability will not be assigned to a peak unless the adjusted
signal-to-noise ratio is 10 or over.

In other situations, a more aggressive approach may be taken as sample
data is more pronounced or the risk of error may be reduced. In such a
situation, the system may be set to assume a 100% probability with a 5 or
greater target signal-to-noise ratio. Of course, an intermediate signal-to-noise
ratio target figure can be selected, such as 7, when a moderate risk of error can
be assumed. Once the target adjusted signal-to-noise ratio is set for the method,
then for any adjusted signal-to-noise ratio a probability can be determined that a
putative peak is an actual peak.

Due to the chemistry involved in performing an identification test,
especially a mass spectrometry test of a sample prepared by DNA amplifications,
the allelic ratio between the signal strength of the highest peak and the signal
strength of the second (or third and so on) highest peak should fall within an
expected ratio. If the allelic ratio falls outside of normal guidelines, the preferred
embodiment imposes an allelic ratio penalty to the probability. For example,
FIG. 53 shows an allelic penalty 315 which has an x-axis 317 that is the
signal strength of the second highest peak divided by the signal
strength of the highest peak. The y-axis 319 assigns a penalty between 0 and 1
depending on the determined allelic ratio. In the preferred embodiment, it is
assumed that allelic ratios over 30% are within the expected range and therefore
no penalty is applied. Between a ratio of 10% and 30%, the penalty is linearly
increased until at allelic ratios below 10% it is assumed the second-highest peak
is not real. For allelic ratios between 10% and 30%, the allelic penalty chart
315 is used to determine a penalty 319, which is multiplied by the peak
probability determined in FIG. 52 to determine a final peak probability. Although
the preferred embodiment incorporates an allelic ratio penalty to account for a
possible chemistry error, it will be appreciated that other techniques may be
used. Similar treatment will be applied to the other peaks.
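
One way to read the allelic penalty chart is as a multiplier that equals 1
(no penalty) at ratios of 30% and above, falls linearly between 30% and 10%,
and equals 0 at 10% and below. The sketch below encodes that reading; the
exact end points of the ramp and the function names are assumptions, since
the chart itself is not reproduced in the text.

```python
def allelic_penalty(ratio, low=0.10, high=0.30):
    """Multiplier in [0, 1] applied to the peak probability."""
    if ratio >= high:
        return 1.0          # within the expected range: no penalty
    if ratio <= low:
        return 0.0          # second peak assumed not to be real
    return (ratio - low) / (high - low)   # linear ramp between the limits

def final_peak_probability(peak_probability, second_height, highest_height):
    """Peak probability after the allelic penalty has been applied."""
    ratio = second_height / highest_height
    return peak_probability * allelic_penalty(ratio)
```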

With the peak probability of each peak determined, the statistical
probability for various composition components may be determined. As an
example, consider determining the probability of each of the three possible
combinations of two peaks, peak G and peak C, namely the combinations GG, CC
and GC. FIG. 54 shows an example where a most probable peak 325 is determined
to have a final peak probability of 90%. Peak 325 is positioned such that it
represents a G component in the biological sample. Accordingly, it can be
maintained that there is a 90% probability that G exists in the biological sample.
Also in the example shown in FIG. 54, the second highest probability is peak
330 which has a peak probability of 20%. Peak 330 is at a position associated
with a C composition. Accordingly, it can be maintained that there is a 20%
probability that C exists in the biological sample.

With the probability of G existing (90%) and the probability of C existing
(20%) as a starting point, the probability of combinations of G and C existing
can be calculated. For example, FIG. 54 indicates that the probability of GG
existing 329 is calculated as 72%. The probability of GG is
equal to the probability of G existing (90%) multiplied by the probability of C not
existing (100% - 20%). So if the probability of G existing is 90% and the
probability of C not existing is 80%, the probability of GG is 72%.

In a similar manner, the probability of CC existing is equivalent to the
probability of C existing (20%) multiplied by the probability of G not existing
(100% - 90%). As shown in FIG. 54, the probability of C existing is 20% while
the probability of G not existing is 10%, so therefore the probability of CC is
only 2%. Finally, the probability of GC existing is equal to the probability of G
existing (90%) multiplied by the probability of C existing (20%). So if the
probability of G existing is 90% and the probability of C existing is 20%, the
probability of GC existing is 18%. In summary form, then, the probability of the
composition of the biological sample is:
probability of GG: 72%;
probability of GC: 18%; and
probability of CC: 2%.
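
The arithmetic above can be reproduced with a few lines of code. The function
name is hypothetical; the products are exactly those stated in the
description.

```python
def combination_probabilities(p_g, p_c):
    """Probabilities of GG, GC and CC from the two peak probabilities."""
    return {
        "GG": p_g * (1.0 - p_c),   # G present, C absent
        "GC": p_g * p_c,           # both present
        "CC": p_c * (1.0 - p_g),   # C present, G absent
    }

probs = combination_probabilities(0.90, 0.20)
print({name: round(p, 2) for name, p in probs.items()})
```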

Once the probabilities of each of the possible combinations have been
determined, FIG. 55 is used to decide whether or not sufficient confidence exists
to call the genotype. FIG. 55 shows a call chart 335 which has an x-axis 337
which is the ratio of the highest combination probability to the second highest
combination probability. The y-axis 339 simply indicates whether the ratio is
sufficiently high to justify calling the genotype. The value of the ratio may be
indicated by M 340. The value of M is set depending upon trial data, sample
composition, and the ability to accept error. For example, the value M may be
set relatively high, such as to a value 4 so that the highest probability must be at
least four times greater than the second highest probability before confidence is
established to call a genotype. However, if a certain level of error may be
acceptable, the value of M may be set to a more aggressive value, such as to 3,
so that the ratio between the highest and second highest probabilities needs to
be only a ratio of 3 or higher. Of course, a moderate value may be selected for M
when a moderate risk can be accepted. Using the example of FIG. 54, where
the probability of GG was 72% and the probability of GC was 18%, the ratio
between 72% and 18% is 4.0; therefore, whether M is set to 3, 3.5, or 4, the
system would call the genotype as GG. Although the preferred embodiment
uses a ratio between the two highest peak probabilities to determine if a
genotype confidently can be called, it will be appreciated that other methods
may be substituted. It will also be appreciated that the above techniques may
be used for calculating probabilities and choosing genotypes (or more general
DNA patterns) containing combinations of more than two peaks.
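
The calling rule can be sketched as a comparison of the two highest
combination probabilities against the threshold M. The function name and the
handling of a zero runner-up probability are assumptions made for
illustration.

```python
def call_genotype(probabilities, m=4.0):
    """Return the most probable genotype if it is at least m times as
    likely as the runner-up; otherwise return None (no call)."""
    ranked = sorted(probabilities.items(), key=lambda kv: kv[1], reverse=True)
    (best, p_best), (_, p_second) = ranked[0], ranked[1]
    if p_best <= 0.0:
        return None                       # nothing to call
    if p_second == 0.0 or p_best / p_second >= m:
        return best
    return None

example = {"GG": 0.72, "GC": 0.18, "CC": 0.02}
print(call_genotype(example, m=3.5))
```

Raising M makes the method more conservative, trading fewer calls for a
lower risk of a wrong call.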

Referring now to FIG. 56, a flow chart is shown generally defining the
process of statistically calling genotype described above. In FIG. 56, block 402
shows that the height of each peak is determined and that in block 404 a noise
profile is extrapolated for each peak. The signal is determined from the height of
each peak in block 406 and the noise for each peak is determined using the
noise profile in block 408. In block 410, the signal-to-noise ratio is calculated
for each peak. To account for a non-Gaussian peak shape, a residual error is
determined in block 412 and an adjusted signal-to-noise ratio is calculated in
block 414. Block 416 shows that a probability profile is developed, with the
probability of each peak existing found in block 418. An allelic penalty may be
applied in block 420, with the allelic penalty applied to the adjusted peak
probability in block 422. The probability of each combination of components is
calculated in block 424 with the ratio between the two highest probabilities
being determined in block 426. If the ratio of probabilities exceeds a threshold
value then the genotype is called in block 428.

In another embodiment of the invention, the computing device 20 (Fig.
24) supports "standardless" genotyping by identifying data peaks that contain
putative SNPs. Standardless genotyping is used, for example, where insufficient
information is known about the samples to determine a distribution of expected
peak locations, against which an allelic penalty as described above can be
reliably calculated. This permits the computing device to be used for
identification of peaks that contain putative SNPs from data generated by any
assay that fragments a targeted DNA molecule. For such standardless
genotyping, peaks that are associated with an area under the data curve that
deviates significantly from the typical area of other peaks in the data spectrum
are identified and their corresponding mass (location along the x-axis) is
determined.

More particularly, peaks that deviate significantly from the average area
of other peaks in the data are identified, and the expected allelic ratio between
data peaks is defined in terms of the ratio of the area under the data peaks.
Theoretically, where each genetic locus has the same molar concentration of
analyte, the area under each corresponding peak should be the same, thus
producing a 1.0 ratio of the peak area between any two peaks. In accordance
with the invention, peaks having a smaller ratio relative to the other peaks in the
data will not be recognized as peaks. More particularly, peaks having an area
ratio smaller than 30% relative to a nominal value for peak area will be assigned
an allelic penalty. The mass of the remaining peaks (their location along the
x-axis of the data) will be determined based on oligonucleotide standards.

Fig. 57 shows a flow diagram representation of the processing by the
computing device 20 (Fig. 24) when performing standardless genotyping. In the
first operation, represented by the flow diagram box numbered 502, the
computing device receives data from the mass spectrometer. Next, the height
of each putative peak in the data sample is determined, as indicated by the block
504. After the height of each peak in the mass spectrometer data is
determined, a de-noise process 505 is performed, beginning with an
extrapolation of the noise profile (block 506), followed by finding the noise of
each peak (block 508) and calculating the signal to noise ratio for each data
sample (block 510). Each of these operations may be performed in accordance
with the description above for denoise operations 45 of Fig. 25. Other suitable
denoise operations will occur to those skilled in the art.

The next operation is to find the residual error associated with each data
point. This is represented by the block 512 in Fig. 57. The next step, block
514, involves calculating an adjusted signal to noise ratio for each identified
peak. A probability profile is developed next (block 516), followed by a
determination of the peak probabilities at block 518. In the preferred
embodiment, the denoise operations of Fig. 57, comprising block 502 to block
518, comprise the corresponding operations described above in conjunction with
Fig. 56 for block 402 through block 418, respectively.
The next action for the standardless genotype processing is to determine
an allelic penalty for each peak, indicated by the block 524. As noted above,
the standardless genotype processing of Fig. 57 determines an allelic penalty by
comparing area under the peaks. Therefore, rather than compare signal strength
ratios to determine an allelic penalty, such as described above for Fig. 53, the
standardless processing determines the area under each of the identified peaks
and compares the ratio of those areas. The area under each peak
may be computed using conventional numerical analysis techniques for
calculating the area under a curve for experimental data.
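
As one example of such a conventional technique, the area under a peak may be approximated with the trapezoidal rule and the areas of two peaks then compared as a ratio. The following is a minimal sketch under stated assumptions: the spectrum is given as parallel lists of mass and intensity samples, the mass window of each peak is already known, and the name `peak_area` and the sample data are hypothetical.

```python
def peak_area(masses, intensities, lo, hi):
    """Trapezoidal area of the samples whose mass lies within [lo, hi]."""
    pts = [(m, y) for m, y in zip(masses, intensities) if lo <= m <= hi]
    return sum((m1 - m0) * (y0 + y1) / 2.0
               for (m0, y0), (m1, y1) in zip(pts, pts[1:]))


# Hypothetical spectrum with two triangular peaks.
masses = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
intens = [0.0, 2.0, 0.0, 0.0, 0.0, 1.0, 0.0]

area_a = peak_area(masses, intens, 0.0, 2.0)
area_b = peak_area(masses, intens, 4.0, 6.0)
ratio = area_b / area_a
```

In this example the second peak has half the area of the first, so the ratio of the areas is 0.5.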

Thus, the allelic penalty is assigned in accordance with Fig. 58, which
shows that no penalty is assigned to peaks having a peak area relative to an
expected average area value that is greater than 0.30 (30%). The allelic penalty
is applied to the peak probability value, which may be determined according to
the process such as described in Fig. 52. It should be apparent from Fig. 58
that the allelic penalty imposed for peaks below a ratio of 30% is that such
peaks will be removed from further measurement and processing. Other penalty
schemes, however, may be imposed in accordance with knowledge about the
data being processed, as determined by those skilled in the art.
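
The penalty scheme just described may be sketched, for illustration only, as a multiplicative factor applied to the peak probability. The sketch assumes that the expected average area is supplied by the caller and that a peak whose ratio is exactly 0.30 is treated the same as one below it, a point the description leaves open. The names `allelic_penalty` and `apply_allelic_penalty` and the sample peaks are hypothetical.

```python
def allelic_penalty(area_ratio, threshold=0.30):
    """Return 1.0 (no penalty) above the threshold, else 0.0 (peak removed).

    Assumption: a ratio exactly equal to the threshold is penalized.
    """
    return 1.0 if area_ratio > threshold else 0.0


def apply_allelic_penalty(peaks, expected_area):
    """Scale each peak probability by its penalty and drop removed peaks."""
    kept = []
    for peak in peaks:
        penalty = allelic_penalty(peak["area"] / expected_area)
        if penalty > 0.0:
            kept.append({**peak, "probability": peak["probability"] * penalty})
    return kept


# Hypothetical peaks; the second has only 20% of the expected area.
peaks = [
    {"mass": 5000.0, "area": 10.0, "probability": 0.9},
    {"mass": 5300.0, "area": 2.0, "probability": 0.8},
]
remaining = apply_allelic_penalty(peaks, expected_area=10.0)
```

A graded penalty, rather than the all-or-nothing factor assumed here, could be substituted by changing `allelic_penalty` alone.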

After the allelic penalty has been determined and applied, the
standardless genotype processing compares the location of the remaining
putative peaks to oligonucleotide standards to determine corresponding masses
in the processing for block 524. For standardless genotype data, the processing
of the block 524 is performed to determine mass and genotype, rather than
performing the operations corresponding to blocks 424, 426, and 428 of Fig. 33.
Techniques for performing such comparisons and determining mass will be
known to those skilled in the art.

In another embodiment, the computing device 20 (Fig. 24) permits the
detection and determination of the mass (location along the x-axis of the data) of
the sense and antisense strand of fragments generated in the assay. If desired,
the computing device may also detect and determine the quantity (area under
each peak) of the respective sense and antisense strands, using a similar
technique to that described above for standardless genotype processing. The
data generated for each type of strand may then be combined to achieve data
redundancy and thereby increase the confidence level of the determined
genotype. This technique obviates primer peaks that are often observed in data
from other diagnostic methods, thereby permitting a higher level of multiplexing.
In addition, when quantitation is used in pooling experiments, the ratio of the
measured peak areas is more reliably calculated than with the peak identifying
technique, due to data redundancy.

Fig. 23 is a flow diagram that illustrates the processing implemented by
the computing device 20 to perform sense and antisense processing. In the first
operation, represented by the flow diagram box numbered 602, the computing
device receives data from the mass spectrometer. This data will include data for
the sense strand and antisense strand of assay fragments. Next, the height of
each putative peak in the data sample is determined, as indicated by the block
604. After the height of each peak in the mass spectrometer data is
determined, a de-noise process 605 is performed, beginning with an operation
that extrapolates the noise profile (block 606), followed by finding the noise of
each peak (block 608) and calculating the signal to noise ratio for each data
sample (block 610). Each of these operations may be performed in accordance
with the description above for the denoise operations 45 of Fig. 25. Other
suitable denoise operations will occur to those skilled in the art. The next
operation is to find the residual error associated with each data point. This is
represented by the block 612 in Figure 36.

After the residual error for the data of the sense strand and antisense
strand has been determined, processing to identify the genotypes will be
performed for the sense strand and also for the antisense strand. Therefore, Fig.
23 shows that processing includes sense strand processing (block 630) and
antisense strand processing (block 640). Each block 630, 640 includes
processing that corresponds to adjusting the signal to noise ratio, developing a
probability profile, determining an allelic penalty, adjusting the peak probability
by the allelic penalty, calculating genotype probabilities, and testing genotype
probability ratios, such as described above in conjunction with blocks 414
through 426 of Fig. 56. The processing of each block 630, 640 may, if desired,
include standardless processing operations such as described above in
conjunction with Fig. 57. The standardless processing may be included in place
of or in addition to the processing operations of Fig. 56.

After the genotype probability processing is completed, the data from the
sense strand and antisense strand processing is combined and compared to
expected database values to obtain the benefits of data redundancy as between
the sense strand and antisense strand. Those skilled in the art will understand
techniques to take advantage of known data redundancies between a sense
strand and antisense strand of assay fragments. This processing is represented
by the block 650. After the data from the two strands is combined for
processing, the genotype processing is performed (block 660) and the genotype
is identified.
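
The description leaves the manner of combining the two strands to those skilled in the art. Purely as an illustrative sketch, and not as the method of block 650, the genotype probabilities obtained for each strand could be multiplied and renormalized. This assumes that the antisense calls have already been mapped to the same genotype labels as the sense calls and that the two strands are treated as independent measurements. The name `combine_strand_probabilities` and the sample probabilities are hypothetical.

```python
def combine_strand_probabilities(sense, antisense):
    """Multiply per-genotype probabilities from both strands and renormalize.

    Assumptions: both dictionaries use the same genotype labels, and the two
    strands are treated as independent measurements.
    """
    shared = sense.keys() & antisense.keys()
    joint = {g: sense[g] * antisense[g] for g in shared}
    total = sum(joint.values())
    if total == 0.0:
        raise ValueError("strands share no genotype with nonzero probability")
    return {g: p / total for g, p in joint.items()}


# Hypothetical per-strand genotype probabilities.
sense_probs = {"AA": 0.6, "AG": 0.3, "GG": 0.1}
antisense_probs = {"AA": 0.5, "AG": 0.4, "GG": 0.1}

combined = combine_strand_probabilities(sense_probs, antisense_probs)
best_genotype = max(combined, key=combined.get)
```

When both strands favor the same genotype, as in this example, the combined probability for that genotype is higher than either strand alone, which is the kind of confidence gain from redundancy referred to above.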

Since modifications will be apparent to those of skill in this art, it is
intended that this invention be limited only by the scope of the appended claims.

WHAT IS CLAIMED IS:

1. A subcollection of samples from a target population, comprising:
a plurality of samples, wherein the samples are selected from the group
consisting of blood, tissue, body fluid, cell, seed, microbe, pathogen and
reproductive tissue samples; and
a symbology on the containers containing the samples, wherein the
symbology is representative of the source and/or history of each sample,
wherein:
the target population is a healthy population that has not been selected
for any disease state;
the collection comprises samples from the healthy population; and
the subcollection is obtained by sorting the collection according to
specified parameters.

2. The subcollection of claim 1, wherein the parameters are selected
from the group consisting of ethnicity, age, gender, height, weight, alcohol
intake, number of pregnancies, number of live births, vegetarians, type of
physical activity, state of residence and/or length of residence in a particular
state, educational level, age of parent at death, cause of parent death, former or
current smoker, length of time as a smoker, frequency of smoking, occurrence
of a disease in immediate family (parent, siblings, children), use of prescription
drugs and/or reason therefor, length and/or number of hospital stays and
exposure to environmental factors.

3. The subcollection of claim 1, wherein the symbology is a bar code. 

4. A method of producing a database, comprising:
identifying healthy members of a population;
obtaining data comprising identifying information and obtaining historical
information and data relating to the identified members of the population and
their immediate family;
entering the data into a database for each member of the population and
associating the member and the data with an indexer.

5. The method of claim 4, further comprising:
obtaining a body tissue or body fluid sample;
analyzing the body tissue or body fluid in the sample; and
entering the results of the analysis for each member into the database
and associating each result with the indexer representative of each member.

6. A database produced by the method of claim 4.

7. A database produced by the method of claim 5.

8. A database, comprising: 

datapoints representative of a plurality of healthy organisms from 
whom biological samples are obtained, 

wherein each datapoint is associated with data representative of 
the organism type and other identifying information.

9. The database of claim 8, wherein the datapoints are answers to
questions regarding one or more parameters selected from the group
consisting of ethnicity, age, gender, height, weight, alcohol intake, number of
pregnancies, number of live births, vegetarians, type of physical activity, state of
residence and/or length of residence in a particular state, educational level, age
of parent at death, cause of parent death, former or current smoker, length of
time as a smoker, frequency of smoking, occurrence of a disease in immediate
family (parent, siblings, children), use of prescription drugs and/or reason
therefor, length and/or number of hospital stays and exposure to environmental
factors.

10. The database of claim 9, wherein the organisms are mammals and 
the samples are body fluids or tissues. 

11. The database of claim 9, wherein the samples are selected from
blood, blood fractions, cells and subcellular organelles. 

12. The database of claim 8, further comprising,

phenotypic data from an organism. 

13. The database of claim 12, wherein the data includes one of physical 
characteristics, background data, medical data, and historical data. 

14. The database of claim 8, further comprising, 

genotypic data from nucleic acid obtained from an organism.

15. The database of claim 14, wherein genotypic data includes,
genetic markers, non-coding regions, microsatellites, RFLPs, VNTRs, historical 
data of the organism, medical history, and phenotypic information. 

16. The database of claim 8 that is a relational database. 

17. The database of claim 16, wherein the data are related to an
indexer datapoint representative of each organism from whom data is obtained.

18. A method of identifying polymorphisms that are candidate genetic 
markers, comprising: 

identifying a polymorphism; and 
identifying any pathway or gene linked to the locus of the
polymorphism, wherein

the polymorphisms are identified in samples associated with a target 
population that comprises healthy subjects. 

19. The method of claim 18, wherein the polymorphism is identified by
detecting the presence of target nucleic acids in a sample by a method,
comprising the steps of:
a) hybridizing a first oligonucleotide to the target nucleic acid;
b) hybridizing a second oligonucleotide to an adjacent region of the
target nucleic acid;
c) ligating the hybridized oligonucleotides; and
d) detecting hybridized first oligonucleotide by mass spectrometry as
an indication of the presence of the target nucleic acid.

20. The method of claim 18, wherein the polymorphism is identified by 
detecting target nucleic acids in a sample by a method, comprising the steps of: 

a) hybridizing a first oligonucleotide to the target nucleic acid and
hybridizing a second oligonucleotide to an adjacent region of the target nucleic
acid;
b) contacting the hybridized first and second oligonucleotides with a
cleavage enzyme to form a cleavage product; and
c) detecting the cleavage product by mass spectrometry as an
indication of the presence of the target nucleic acid.

21. The method of claim 20, wherein the samples are from subjects in
a healthy database.

22. The method of claim 18, wherein the polymorphism is identified
by identifying target nucleic acids in a sample by primer oligo base extension
(PROBE).

23. The method of claim 22, wherein primer oligo base extension,
comprises:

a) obtaining a nucleic acid molecule that contains a target nucleotide; 

b) optionally immobilizing the nucleic acid molecule onto a solid support,
to produce an immobilized nucleic acid molecule;

c) hybridizing the nucleic acid molecule with a primer oligonucleotide that 
is complementary to the nucleic acid molecule at a site adjacent to the target 
nucleotide; 

d) contacting the product of step c) with a composition comprising a
dideoxynucleoside triphosphate or a 3'-deoxynucleoside triphosphate and a
polymerase, so that only a dideoxynucleoside or 3'-deoxynucleoside triphosphate
that is complementary to the target nucleotide is extended onto the primer; and
e) detecting the extended primer, thereby identifying the target 
nucleotide. 

24. The method of claim 23, wherein detection of the extended primer
is effected by mass spectrometry, comprising:
ionizing and volatizing the product of step d); and
detecting the extended primer by mass spectrometry, thereby identifying
the target nucleotide.

25. The method of claim 24, wherein:
samples are presented to the mass spectrometer as arrays on chips; and
each sample occupies a volume that is about the size of the laser spot
projected by the laser in a mass spectrometer used in matrix-assisted laser
desorption/ionization (MALDI) spectrometry.

26. A combination, comprising:
a database containing parameters associated with a datapoint
representative of a subject from whom samples are obtained, wherein the
subjects are healthy; and
an indexed collection of the samples, wherein the index identifies the
subject from whom the sample was obtained.

27. The combination of claim 26, wherein the parameter is selected
from the group consisting of ethnicity, age, gender, height, weight, alcohol
intake, number of pregnancies, number of live births, vegetarians, type of
physical activity, state of residence and/or length of residence in a particular
state, educational level, age of parent at death, cause of parent death, former or
current smoker, length of time as a smoker, frequency of smoking, occurrence
of disease in immediate family (parent, siblings, children), use of prescription
drugs and/or reason therefor, length and/or number of hospital stays and
exposure to environmental factors.

28. The combination of claim 26, wherein the database further 
contains genotypic data for each subject. 

29. The combination of claim 26, wherein the samples are blood. 
30. A data storage medium, comprising the database of claim 8.

31. A computer system, comprising the database of claim 8.

32. A system for high throughput processing of biological samples, 
comprising: 

a process line comprising a plurality of processing stations, each of which
performs a procedure on a biological sample contained in a
reaction vessel;
a robotic system that transports the reaction vessel from processing
station to processing station;
a data analysis system that receives test results of the process line and
automatically processes the test results to make a determination
regarding the biological sample in the reaction vessel;
a control system that determines when the test at each processing
station is complete and, in response, moves the reaction vessel to
the next test station, and continuously processes reaction vessels
one after another until the control system receives a stop
instruction; and
a database of claim 8, wherein the samples tested by the automated
process line comprise samples from subjects in the database.

33. The system of claim 32, wherein one of the processing stations 
comprises a mass spectrometer. 

34. The system of claim 32, wherein the data analysis system
processes the test results by receiving test data from the mass spectrometer
such that the test data for a biological sample contains one or more signals,
whereupon the data analysis system determines the area under the curve of 
each signal and normalizes the results thereof and obtains a substantially 
quantitative result representative of the relative amounts of components in the 
tested sample. 

35. A method for high throughput processing of biological samples,
the method comprising:
transporting a reaction vessel along a system of claim 32, comprising a
process line having a plurality of processing stations, each of
which performs a procedure on one or more biological samples
contained in the reaction vessel;
determining when the test procedure at each processing station is
complete and, in response, moving the reaction vessel to the next
processing station;
receiving test results of the process line and automatically processing the
test results to make a data analysis determination regarding the
biological samples in the reaction vessel; and
processing reaction vessels continuously one after another until receiving
a stop instruction, wherein the samples tested by the automated
process line comprise samples from subjects in the database.

36. The method of claim 35, wherein one of the processing stations
comprises a mass spectrometer.

37. The method of claim 36, wherein the samples are analyzed by a
method comprising primer oligo base extension (PROBE).

38. The method of claim 37, further comprising:
processing the test results by receiving test data from the mass
spectrometer such that the test data for a biological sample contains one or
more signals or numerical values representative of signals, whereupon the data
analysis system determines the area under the curve of each signal and
normalizes the results thereof and obtains a substantially quantitative result
representative of the relative amounts of components in the tested sample.

39. The method of claim 37, wherein primer oligo base extension,
comprises:

a) obtaining a nucleic acid molecule that contains a target nucleotide; 

b) optionally immobilizing the nucleic acid molecule onto a solid support, 
to produce an immobilized nucleic acid molecule; 

c) hybridizing the nucleic acid molecule with a primer oligonucleotide that
is complementary to the nucleic acid molecule at a site adjacent to the target
nucleotide;
d) contacting the product of step c) with a composition comprising a
dideoxynucleoside triphosphate or a 3'-deoxynucleoside triphosphate and a
polymerase, so that only a dideoxynucleoside or 3'-deoxynucleoside triphosphate
that is complementary to the target nucleotide is extended onto the primer; and

e) detecting the primer, thereby identifying the target nucleotide. 

40. The method of claim 39, wherein detection of the extended primer is
effected by mass spectrometry, comprising:
ionizing and volatizing the product of step d); and

detecting the extended primer by mass spectrometry, thereby identifying 
the target nucleotide. 

41. The method of claim 36, wherein the target nucleic acids in the
sample are detected and/or identified by a method, comprising the steps of:
a) hybridizing a first oligonucleotide to the target nucleic acid;
b) hybridizing a second oligonucleotide to an adjacent region of the
target nucleic acid;
c) ligating the hybridized oligonucleotides; and
d) detecting hybridized first oligonucleotide by mass spectrometry as
an indication of the presence of the target nucleic acid.

42. The method of claim 36, wherein the target nucleic acids in the
sample are detected and/or identified by a method, comprising the steps of:
a) hybridizing a first oligonucleotide to the target nucleic acid and
hybridizing a second oligonucleotide to an adjacent region of the target nucleic
acid;
b) contacting the hybridized first and second oligonucleotides with a
cleavage enzyme to form a cleavage product; and

c) detecting the cleavage product by mass spectrometry as an 
indication of the presence of the target nucleic acid. 

43. A method of producing a database stored in a computer memory, 
comprising: 

identifying healthy members of a population;

obtaining identifying and historical information and data relating to the 
identified members of the population; 

entering the member-related data into the computer memory database for
each identified member of the population and associating the member and the
data with an indexer.

44. The method of claim 43, further comprising: 

obtaining a body tissue or body fluid sample of an identified member; 
analyzing the body tissue or body fluid in the sample; and 
entering the results of the analysis for each member into the computer
memory database and associating each result with the indexer representative of
each member.

45. A database produced by the method of claim 43. 

46. A database produced by the method of claim 44. 

47. The database of claim 8, wherein:
the organisms are selected from among animals, bacteria, fungi,
protozoans and parasites; and
each datapoint is associated with parameters representative of the
organism type and identifying information.

48. The database of claim 43, further comprising,
phenotypic data regarding each subject.

49. The database of claim 47 that is a relational database and the
parameters are the answers to the questions in the questionnaire.

50. The database of claim 8, further comprising, 

genotypic data of nucleic acid of the subject, wherein genotypic data
includes, but is not limited to, genetic markers, non-coding regions,
microsatellites, restriction fragment length polymorphisms (RFLPs), variable
number tandem repeats (VNTRs), historical data of the organism, the medical
history of the subject, phenotypic information, and other information.

51. A database, comprising data records stored in computer memory,
wherein the data records contain information that identifies healthy members of
a population, and also contain identifying and historical information and data
relating to the identified members.

52. The database of claim 51, further comprising an index value for 
each identified member that associates each member of the population with the 
identifying and historical information and data. 

53. A computer system, comprising the database of claim 51.

54. An automated process line, comprising the database of claim 51. 

55. A method for determining a polymorphism that correlates with 
age, ethnicity or gender, comprising: 

identifying a polymorphism; and 
determining the frequency of the polymorphism with increasing age, with
ethnicity or with gender in a healthy population.

56. A method for determining whether a polymorphism correlates with
susceptibility to morbidity, early mortality, or morbidity and early mortality,
comprising:
identifying a polymorphism; and
determining the frequency of the polymorphism with increasing age in a
healthy population.

57. A high throughput method of determining frequencies of genetic
variations, comprising:

selecting a healthy target population and a genetic variation to be 
assessed; 

pooling a plurality of samples of biopolymers obtained from members of
the population;
determining or detecting the biopolymer that comprises the variation by
mass spectrometry;
obtaining a mass spectrum or a digital representation thereof; and
determining the frequency of the variation in the population.

58. The method of claim 57, wherein: 

the variation is selected from the group consisting of an allelic variation, a 
post-translational modification, a nucleic modification, a label, a mass 
modification of a nucleic acid and methylation; and/or 

the biopolymer is a nucleic acid, a protein, a polysaccharide, a lipid, a
small organic metabolite or intermediate, wherein the concentration of
biopolymer of interest is the same in each of the samples; and/or
the frequency is determined by assessing the method comprising
determining the area under the peak in the mass spectrum or digital
representation thereof corresponding to the mass of the biopolymer comprising
the genomic variation.

59. The method of claim 58, wherein the method for determining the
frequency is effected by determining the ratio of the signal or the digital
representation thereof to the total area of the entire mass spectrum, which is
corrected for background.

60. A method for discovery of a polymorphism in a population, 
comprising: 

sorting the database of claim 8 according to a selected parameter to
identify samples that match the selected parameter;
isolating a biopolymer from each identified sample;
optionally pooling each isolated biopolymer;
optionally amplifying the amount of biopolymer;
cleaving the pooled biopolymers to produce fragments thereof;
obtaining a mass spectrum of the resulting fragments and comparing the
mass spectrum with a control mass spectrum to identify differences between the
spectra and thereby identifying any polymorphisms; wherein:
the control mass spectrum is obtained from unsorted samples in the
collection or samples sorted according to a different parameter.

61. The method of claim 60, wherein cleaving is effected by contacting 
the biopolymer with an enzyme. 

62. The method of claim 61, wherein the enzyme is selected from the
group consisting of nucleotide glycosylase, a nickase and a type IIS restriction
enzyme.

63. The method of claim 60, wherein the biopolymer is a nucleic acid 
or a protein. 

64. The method of claim 60, wherein the mass spectrometric
format is selected from among Matrix-Assisted Laser Desorption/Ionization,
Time-of-Flight (MALDI-TOF), Electrospray (ES), IR-MALDI, Ion Cyclotron
Resonance (ICR), Fourier Transform and combinations thereof.

65. A method for discovery of a polymorphism in a population, 
comprising: 

obtaining samples of body tissue or fluid from a plurality of organisms;

isolating a biopolymer from each sample; 
pooling each isolated biopolymer; 
optionally amplifying the amount of biopolymer; 
cleaving the pooled biopolymers to produce fragments thereof; 
obtaining a mass spectrum of the resulting fragments;

comparing the frequency of each fragment to identify fragments present 
in amounts lower than the average frequency, thereby identifying any 
polymorphisms. 

66. The method of claim 65, wherein cleaving is effected by contacting
the biopolymer with an enzyme.

67. The method of claim 66, wherein the enzyme is selected from the
group consisting of nucleotide glycosylase, a nickase and a type IIS restriction
enzyme.

68. The method of claim 65, wherein the biopolymer is a nucleic acid
or a protein.

69. The method of claim 65, wherein the mass spectrometric
format is selected from among Matrix-Assisted Laser Desorption/Ionization,
Time-of-Flight (MALDI-TOF), Electrospray (ES), IR-MALDI, Ion Cyclotron
Resonance (ICR), Fourier Transform and combinations thereof.

70. The method of claim 65, wherein the samples are obtained from
healthy subjects.

71. A method of correlating a polymorphism with a parameter, 
comprising: 

sorting the database of claim 8 according to a selected parameter to 
identify samples that match the selected parameter;

isolating a biopolymer from each identified sample; 

pooling each isolated biopolymer; 

optionally amplifying the amount of biopolymer; 

determining the frequency of the polymorphism in the pooled
biopolymers, wherein:
an alteration of the frequency of the polymorphism compared to a control,
indicates a correlation of the polymorphism with the selected parameter; and
the control is the frequency of the polymorphism in pooled biopolymers
obtained from samples identified from an unsorted database or from a database
sorted according to a different parameter.

72. The method of claim 71, wherein the parameter is selected from the
group consisting of ethnicity, age, gender, height, weight, alcohol intake,
number of pregnancies, number of live births, vegetarians, type of physical
activity, state of residence and/or length of residence in a particular state,
educational level, age of parent at death, cause of parent death, former or
current smoker, length of time as a smoker, frequency of smoking, occurrence
of a disease in immediate family (parent, siblings, children), use of prescription
drugs and/or reason therefor, length and/or number of hospital stays and
exposure to environmental factors.

73. The method of claim 72, wherein the parameter is occurrence of
disease or a particular disease in an immediate family member, thereby
correlating the polymorphism with the disease.

74. The method of claim 71, wherein the pooled biopolymers are 
pooled nucleic acid molecules. 

75. The method of claim 74, wherein the polymorphism is detected 
by primer oligo base extension (PROBE). 

76. The method of claim 75, wherein primer oligo base extension,
comprises:

a) optionally immobilizing the nucleic acid molecules onto a solid support, 
to produce immobilized nucleic acid molecules; 

b) hybridizing the nucleic acid molecules with a primer oligonucleotide
that is complementary to the nucleic acid molecule at a site adjacent to the
polymorphism;
c) contacting the product of step b) with a composition comprising a
dideoxynucleoside triphosphate or a 3'-deoxynucleoside triphosphate and a
polymerase, so that only a dideoxynucleoside or 3'-deoxynucleoside triphosphate
that is complementary to the polymorphism is extended onto the primer; and

d) detecting the extended primer, thereby detecting the polymorphism in 
nucleic acid molecules in the pooled nucleic acids. 

77. The method of claim 76, wherein detecting is effected by mass 
spectrometry. 

78. The method of claim 71, wherein the frequency is the percentage of
nucleic acid molecules in the pooled nucleic acids that contain the
polymorphism.

79. The method of claim 78, wherein the ratio is determined by
obtaining mass spectra of the pooled nucleic acids.

80. The method of claim 72, wherein the parameter is age, thereby
correlating the polymorphism with susceptibility to morbidity, early mortality or
morbidity and early mortality.

81. A method for haplotyping polymorphisms in a nucleic acid,
comprising:

(a) sorting the database of claim 8 according to a selected parameter 
to identify samples that match the selected parameter; 
(b) isolating nucleic acid from each identified sample;

(c) optionally pooling each isolated nucleic acid; 

(d) amplifying the amount of nucleic acid; 

(e) forming single-stranded nucleic acid and splitting each single-strand
into a separate reaction vessel;
(f) contacting each single-stranded nucleic acid with an adaptor
nucleic acid to form an adaptor complex;

(g) contacting the adaptor complex with a nuclease and a ligase; 

(h) contacting the products of step (g) with a mixture that is capable 
of amplifying a ligated adaptor to produce an extended product; 

(i) obtaining a mass spectrum of each nucleic acid resulting from step
(h) and detecting a polymorphism by identifying a signal corresponding to the
extended product;
(j) repeating steps (f) through (i) utilizing an adaptor nucleic acid able
to hybridize with another adaptor nucleic acid that hybridizes to a different
sequence on the same strand; whereby

the polymorphisms are haplotyped by detecting more than one extended 
product. 

82. The method of claim 81, wherein the nuclease is Fen-1.

83. A method for haplotyping polymorphisms in a population,
comprising:

sorting the database of claim 8 according to a selected parameter to 
identify samples that match the selected parameter; 

isolating a nucleic acid from each identified sample; 
pooling each isolated nucleic acid; 
optionally amplifying the amount of nucleic acid;
contacting the nucleic acid with at least one enzyme to produce
fragments thereof;
obtaining a mass spectrum of the resulting fragments; whereby:
the polymorphisms are detected by detecting signals corresponding to the
polymorphisms; and
the polymorphisms are haplotyped by determining from the mass
spectrum that the polymorphisms are located on the same strand of the nucleic
acid.

84. The method of claim 83, wherein the enzyme is a nickase. 

85. The method of claim 84, wherein the nickase is selected from the 
group consisting of NY2A and NYS1. 

86. A method for detecting methylated nucleotides within a nucleic
acid sample, comprising:

splitting a nucleic acid sample into separate reaction vessels;

contacting nucleic acid in one reaction vessel with bisulfite;

amplifying the nucleic acid in each reaction vessel;

cleaving the nucleic acids in each reaction vessel to produce fragments
thereof;

obtaining a mass spectrum of the resulting fragments from one reaction
vessel and another mass spectrum of the resulting fragments from another
reaction vessel; whereby:

cytosine methylation is detected by identifying a difference in signals
between the mass spectra.

87. The method of claim 86, wherein: 

the step of amplifying is carried out in the presence of uracil; and 
the step of cleaving is effected by a uracil glycosylase. 
88. A method for identifying a biological sample, comprising:

generating a data set indicative of the composition of the biological
sample;

denoising the data set to generate denoised data;

deleting the baseline from the denoised data to generate an intermediate
data set;

defining putative peaks for the biological sample;

using the putative peaks to generate a residual baseline;

removing the residual baseline from the intermediate data set to generate
a corrected data set;

locating, responsive to removing the residual baseline, a probable peak in
the corrected data set; and

identifying, using the located probable peak, the biological sample,
wherein the generated biological sample data set comprises data from
sense strands and antisense strands of assay fragments.
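
The sequence of steps recited in claim 88 can be illustrated with a minimal sketch. It is not the applicants' implementation: the moving-average denoiser, the rolling-minimum baseline, the median-based residual baseline, the threshold, and all function and parameter names are assumptions chosen only to keep the example short and self-contained.

```python
from statistics import median


def moving_average(data, width=3):
    """Denoise by averaging each point with its neighbours (assumed method)."""
    half = width // 2
    return [
        sum(data[max(0, i - half): i + half + 1])
        / len(data[max(0, i - half): i + half + 1])
        for i in range(len(data))
    ]


def rolling_minimum(data, width=9):
    """Estimate the baseline as the minimum in a sliding window (assumed method)."""
    half = width // 2
    return [min(data[max(0, i - half): i + half + 1]) for i in range(len(data))]


def local_maxima(data, threshold):
    """Indices of interior points that exceed the threshold and their neighbours."""
    return [
        i for i in range(1, len(data) - 1)
        if data[i] > threshold and data[i] >= data[i - 1] and data[i] > data[i + 1]
    ]


def identify_sample(raw, expected, peak_halfwidth=2, threshold=5.0):
    """Run the claimed steps in order and return (peak index, best-matching label).

    `expected` is a hypothetical mapping from a label to an expected peak index.
    """
    denoised = moving_average(raw)                                # denoising
    baseline = rolling_minimum(denoised)
    intermediate = [d - b for d, b in zip(denoised, baseline)]    # baseline deleted
    putative = local_maxima(intermediate, threshold)              # putative peaks
    in_peak = set()
    for p in putative:
        in_peak.update(range(p - peak_halfwidth, p + peak_halfwidth + 1))
    background = [v for i, v in enumerate(intermediate) if i not in in_peak]
    residual = median(background) if background else 0.0         # residual baseline
    corrected = [v - residual for v in intermediate]              # corrected data set
    candidates = local_maxima(corrected, threshold)
    if not candidates:
        return None, None
    probable = max(candidates, key=lambda i: corrected[i])        # probable peak
    label = min(expected, key=lambda name: abs(expected[name] - probable))
    return probable, label


# Synthetic spectrum: a sloping baseline with one peak centred at index 20.
raw = [10 + 0.1 * i for i in range(40)]
raw[19] += 20
raw[20] += 50
raw[21] += 20
peak_index, label = identify_sample(raw, {"allele A": 20, "allele B": 30})
```

The residual baseline is computed only from points outside the putative peaks, which is one way to read "using the putative peaks to generate a residual baseline"; other readings are possible.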

89. The method according to claim 88, wherein identifying includes
combining data from the sense strands and the antisense strands, and comparing the data
against expected sense strand and antisense strand values, to identify the
biological sample.

90. The method according to claim 88, wherein identifying includes 
deriving a peak probability for the probable peak, in accordance with whether the 
probable peak is from sense strand data or from antisense strand data. 

91. The method according to claim 88, wherein identifying includes
deriving a peak probability for the probable peak and applying an allelic penalty in
response to a ratio between a calculated area under the probable peak and a calculated
expected average area under all peaks in the data set.

92. A method for identifying a biological sample, comprising:

generating a data set indicative of the composition of the biological
sample;

denoising the data set to generate denoised data;

deleting the baseline from the denoised data to generate an intermediate
data set;

defining putative peaks for the biological sample;

using the putative peaks to generate a residual baseline;

removing the residual baseline from the intermediate data set to generate
a corrected data set;

locating, responsive to removing the residual baseline, a probable peak in
the corrected data set; and

identifying, using the located probable peak, the biological sample,
wherein identifying includes deriving a peak probability for the probable
peak and applying an allelic penalty in response to a ratio between a calculated
area under the probable peak and a calculated expected average area under all
peaks in the data set.

93. The method according to claim 92, wherein identifying includes
comparing data from probable peaks that did not receive an applied allelic penalty to
determine their mass in accordance with oligonucleotide biological data.

94. The method according to claim 92, wherein the allelic penalty is
not applied to probable peaks whose ratio of area under the peak to the
expected area value is greater than 30%.
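
The area-ratio rule of claims 91 through 94 can be sketched as follows. The 30% cutoff comes from claim 94; the multiplicative form of the penalty, the value of `penalty_factor`, and the function name are hypothetical, since the claims do not state how the penalty changes the peak probability.

```python
def apply_allelic_penalty(peak_probability, peak_area, all_peak_areas,
                          cutoff=0.30, penalty_factor=0.5):
    """Return (possibly penalised probability, area ratio) for one probable peak.

    The expected area is taken as the average area of all peaks in the data set.
    `penalty_factor` is an illustrative assumption, not a value from the source.
    """
    expected_area = sum(all_peak_areas) / len(all_peak_areas)
    ratio = peak_area / expected_area
    if ratio > cutoff:
        # Claim 94: no penalty when the ratio is greater than 30%.
        return peak_probability, ratio
    return peak_probability * penalty_factor, ratio


areas = [100.0, 90.0, 20.0]                  # average area is 70.0
strong = apply_allelic_penalty(0.9, 90.0, areas)   # ratio about 1.29, unpenalised
weak = apply_allelic_penalty(0.9, 20.0, areas)     # ratio about 0.29, penalised
```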

95. A method for detecting a polymorphism in a nucleic acid, 
comprising: 

amplifying a region of the nucleic acid to produce an amplicon, wherein
the resulting amplicon comprises one or more enzyme restriction sites;

contacting the amplicon with a restriction enzyme to produce fragments;

obtaining a mass spectrum of the resulting fragments and analyzing
signals in the mass spectrum by the method of claim 88; whereby:

the polymorphism is detected from the pattern of the signals.

96. A subcollection of samples from a target population, comprising:

a plurality of samples, wherein the samples are selected from the group
consisting of nucleic acids, fetal tissue, protein samples; and

a symbology on the containers containing the samples, wherein the
symbology is representative of the source and/or history of each sample,
wherein:

the target population is a healthy population that has not been selected
for any disease state;

the collection comprises samples from the healthy population; and

the subcollection is obtained by sorting the collection according to
specified parameters.

97. The combination of claim 26, wherein the samples are selected
from the group consisting of nucleic acids, fetal tissue, protein, tissue,
body fluid, cell, seed, microbe, pathogen and reproductive tissue samples.

98. A combination, comprising the database of claim 8 and a mass 
spectrometer. 

99. The combination of claim 98 that is an automated process line for
analyzing biological samples.

100. A system for high throughput processing of biological samples,
comprising:

a database of claim 8, wherein the samples tested by the automated
process line comprise samples from subjects in the database; and

a mass spectrometer for analysis of biopolymers in the samples.

Age-Related Frequency of the 291S Allele
(LPL) Caucasians

Questionnaire for
Population-Based
Sample Banking

Data Collection Form

Collection Information

Consent Form Signed: □ Yes □ No
Date of Collection (MM/DD/YY): _/_/98
Time of Sample Collection (nearest hour in 24 hour clock format):
Initials of Data Collector:          Collecting Agency:
(DO NOT COMPLETE: For Data Entry Only) Sample: □ intact □ lost □ broken

Affix Barcode Here

Donor Information

Sex: □ Male □ Female          Date of Birth (MM/YY): _/_
In which state do you live?          How long have you lived there? ___ Years

What is the highest grade you completed in school?
□ less than 8th grade
□ 8th, 9th, 10th or 11th grade
□ high school graduate or equivalency
□ some college, 2 yr degree
□ college graduate, 4 yr degree
□ post graduate education or degree

To the best of your knowledge, what is the Ethnic Origin of your

Father Mother
□ □ Caucasian (please check specific geographic area below if known)
□ □     Northern Europe (Austria, Denmark, Finland, France, Germany, Netherlands, Norway, Sweden, Switzerland, U.K.)
□ □     Southern Europe (Greece, Italy, Spain)
□ □     Eastern Europe (Czechoslovakia, Hungary, Poland, Russia, Yugoslavia)
□ □     Middle Eastern (Israel, Egypt, Iran, Iraq, Jordan, Syria, other Arab States)
□ □ African-American
□ □ Hispanic (please check specific geographic area below if known)
□ □     Mexico
□ □     Central America, South America
□ □     Cuba, Puerto Rico, other Caribbean
□ □ Asian (please check specific geographic area below if known)
□ □     Japanese
□ □     Chinese
□ □     Korean
□ □     Vietnamese
□ □     other Asian
□ □ Other
□ □ Don't know

Health information: Have you or has anyone in your immediate family (parents, brothers, sisters, or your children)
had the following? Check all that apply.

Disease:                                          You  Mother  Father  Sister  Brother  Child
Heart Disease, Stroke or Arteriosclerosis
Cancer (Specify type if known)
Alzheimer's Disease or Dementia
Chronic Inflammatory or Autoimmune Disease
Nervous System Disease like Multiple Sclerosis
Other (please specify)

Additional health information details you would like to provide:

FIG. 3

SUBSTITUTE SHEET (RULE 26)



WO 01/027857 



7/51 



PCT/US00/28413 



[Figure not recoverable from the scan. Legible labels: "Stock Solution", "Dry Ambient Temperature".]



[Sheets 8/51 through 10/51: charts not recoverable from the scan. Legible axis labels: "Allelic Frequency [%]" (sheet 8/51), "Frequency [%]" (sheet 10/51).]



[Sheet 11/51: two charts not recoverable from the scan. Legible axis label: "Allelic Frequencies [%]". Legible data labels, as paired values: 23.97 / 76.03; 25.71 / 74.29; 54.35 / 45.65; 42.48 / 57.52; 28.88 / 71.12.]






Age-Related p21 S31R Allele and Genotype Frequencies
Caucasian Population

[Sheet 14/51: bar chart not recoverable from the scan. Horizontal axis: Alleles and Genotypes. Legend: 18-49 (n=910); 50-79 (n=824).]

Significance: Genotype frequency of SR heterozygous
drops from 13.3% to 9.2%; p=0.009

FIG. 8
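
The significance statement of FIG. 8 can be checked approximately with a two-proportion z-test. This is only a sketch: the figure does not say which test the applicants used, and the heterozygote counts below are back-calculated by rounding the reported percentages against the reported group sizes, so the resulting p-value need not equal the reported p=0.009.

```python
import math


def two_proportion_z_test(count_a, n_a, count_b, n_b):
    """Two-sided z-test for the difference between two independent proportions."""
    p_a = count_a / n_a
    p_b = count_b / n_b
    pooled = (count_a + count_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return z, p_value


# Assumed counts: 13.3% of 910 donors aged 18-49, 9.2% of 824 donors aged 50-79.
young_heterozygotes = round(0.133 * 910)
old_heterozygotes = round(0.092 * 824)
z, p_value = two_proportion_z_test(young_heterozygotes, 910, old_heterozygotes, 824)
```

Under these assumptions the test gives a p-value below 0.01, the same order of magnitude as the value reported in the figure.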



[Sheets 15/51 through 21/51: charts not recoverable from the scan. The only legible axis label is "Frequency [%]" (sheet 15/51).]



P53-Rb Pathway

[Sheet 22/51: pathway diagram not recoverable from the scan. Legible labels: "p16", "S-phase".]

FIG. 16



[Sheet 23/51: block diagram not recoverable from the scan. Legible block labels: CPU, DASD, MEMORY, NETWORK INTERFACE, DISPLAY, KEYBOARD, PROGRAM PRODUCT READER, NETWORK. Legible reference numerals: 1700, 1702, 1704, 1706, 1707, 1708, 1710, 1712, 1713, 1714, 1716.]

FIG. 17



[Sheet 24/51: flowchart]

( START )

1802: IDENTIFY HEALTHY MEMBERS OF A POPULATION.

1804: OBTAIN IDENTIFYING AND HISTORICAL INFORMATION AND DATA RELATING TO
THE IDENTIFIED MEMBERS.

1806: ENTER THE OBTAINED MEMBER INFORMATION AND DATA INTO A DATABASE.

1808: ASSOCIATE THE MEMBER INFORMATION AND DATA WITH AN INDEX VALUE.

( CONTINUE )

FIG. 18
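
The four flowchart steps of FIG. 18 can be illustrated with a small in-memory database. This is a sketch under stated assumptions: the table layout, the column names, and the use of an auto-incremented integer as the index value are hypothetical and are not taken from the specification.

```python
import sqlite3


def build_sample_database(members):
    """Store healthy members and associate each record with an index value."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE donors ("
        "index_value INTEGER PRIMARY KEY AUTOINCREMENT, "
        "sex TEXT, birth_year INTEGER, state TEXT)"
    )
    # Identify healthy members of the population.
    healthy = [m for m in members if m["healthy"]]
    # Enter the obtained information; the database assigns the index value.
    conn.executemany(
        "INSERT INTO donors (sex, birth_year, state) VALUES (?, ?, ?)",
        [(m["sex"], m["birth_year"], m["state"]) for m in healthy],
    )
    conn.commit()
    return conn


members = [
    {"sex": "F", "birth_year": 1950, "state": "CA", "healthy": True},
    {"sex": "M", "birth_year": 1972, "state": "NV", "healthy": False},
    {"sex": "F", "birth_year": 1981, "state": "CA", "healthy": True},
]
conn = build_sample_database(members)

# Sorting by a selected parameter (here sex and birth year) yields the index
# values of the matching samples.
rows = conn.execute(
    "SELECT index_value FROM donors WHERE sex = ? ORDER BY birth_year", ("F",)
).fetchall()
```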






clone chromosome 17 (#4B319)

[Sheet 25/51: bar chart not recoverable from the scan. Legible panel labels: females, total population. Category labels: *C, *T, C, CT, T. Legend: <40, >60.]

FIG. 19



AKAP10 Ile646Val

[Sheet 26/51: bar chart not recoverable from the scan. Panel labels: females, males, total population. Category labels: *G, *A, G, GA. Legend: <40, >60.]

FIG. 20



methionine sulfoxide reductase A (#63306)

[Sheet 27/51: bar chart not recoverable from the scan. Legible axis ticks: 50.00, 40.00. Panel labels: females, males, total population. Category labels: *C, *T, C, CT, T. Legend: <40, >60.]

FIG. 21



[Sheet 28/51: machine-readable questionnaire; answer bubbles and digit grids are not reproduced]

Collection Information

Consent Form Signed: □ Yes □ No
Date of Collection: Month (JAN through DEC), Day, Year
Time of Sample Collection (nearest hour, in 24 hour clock format):
Initials of Data Collector:
(DO NOT COMPLETE; for data entry only) Sample: □ Intact □ Lost □ Broken
Volume (ml):

Donor Information

Date of Birth: Month (JAN through DEC), Year
Height (Inches):
Sex: □ Male □ Female
Weight (lb):

What physical activity do you do on a regular basis?
□ Running □ Swimming □ Biking □ Gymnastics □ Other □ None

Are you a vegetarian? □ Yes □ No

How many times have you been pregnant?
How many times did you give birth?

In which state do you live?
How long have you lived there?

To the best of your knowledge, what is the Ethnic Origin of your

Father Mother
□ □ Caucasian (please mark specific geographic area below if known)
□ □     Northern Europe (Austria, Denmark, Finland, France, Germany, Netherlands, Norway, Sweden, Switzerland, UK)
□ □     Southern Europe (Greece, Italy, Spain, Turkey)
□ □     Eastern Europe (Czechoslovakia, Hungary, Poland, Russia, Yugoslavia)
□ □     Middle Eastern (Israel, Egypt, Iran, Iraq, Jordan, Syria, Other Arab States)
□ □ African-American
□ □ Hispanic (please mark specific geographic area below if known)
□ □     Mexico
□ □     Central America, South America
□ □     Cuba, Puerto Rico, other Caribbean
□ □ Asian (please mark specific geographic area below if known)
□ □     Japanese
□ □     Chinese
□ □     Korean
□ □     Vietnamese
□ □     Filipino
□ □ Native American
□ □ Other
□ □ Don't know

What is the highest grade you completed in school?
□ less than 8th grade
□ 8th, 9th, 10th, or 11th grade
□ high school graduate or equivalency
□ some college, 2yr degree
□ college graduate, 4yr degree
□ post graduate education or degree

Mother Deceased? □ Yes □ No          Father Deceased? □ Yes □ No
If Yes, at what age? (asked separately for Mother and Father)
□ ≤29 □ 30-39 □ 40-49 □ 50-59 □ 60-69 □ 70-79 □ 80-89 □ ≥90
Cause of Death (asked separately for Mother and Father)
□ Heart Disease □ Cancer □ Stroke □ Accident □ Suicide □ Other

Have you ever smoked? □ Yes □ No
If yes, for how long? ___ Years

Have you been hospitalized in the past 5 years for more than 6 days at a time? □ Yes □ No
If yes, how many times?
For each hospitalization (if not the same), how long did you stay and for what reason?
1) Weeks: ___  □ Acute disorder, including infection and thrombosis □ Chronic Disorder □ Accident □ Other
2) Weeks: ___  □ Acute disorder, including infection and thrombosis □ Chronic Disorder □ Accident □ Other
3) Weeks: ___  □ Acute disorder, including infection and thrombosis □ Chronic Disorder □ Accident □ Other

Have you or has anyone in your immediate family (parents, brothers, sisters, or your children) had the following?
Mark all that apply!

Disease                                             You  Mother  Father  Sister  Brother  Child
Heart Disease, including arteriosclerosis
Stroke
Hypertension
Blood clots
Diabetes, insulin dependent
Diabetes, not insulin-dependent (diet controlled)
Cancer:
    Lung & Bronchus
    Breasts
    Prostate
    Colon & Rectum
    Skin
    Lymphoma & Leukemia
    Other, please specify below:
Alzheimer's
Epilepsy
Schizophrenia
Bipolar disorder (manic depression)
Major depression
Chronic Inflammatory or Autoimmune Disease including Multiple Sclerosis and Rheumatoid Arthritis
Emphysema
Other, please specify below:

Do you take prescription drugs on a regular basis? □ Yes □ No
If yes, please specify below:

Have you ever donated blood before? □ Yes □ No
If yes, how many times: Number of Times

Additional health information details you would like to provide:

USE ONLY

Do you drink any kind of alcoholic beverage?
□ Never □ Hardly ever □ Less than 3 times per week □ 3 or more times per week □ Daily

[Following sheet: second machine-readable questionnaire; answer bubbles and digit grids are not reproduced]

Collection Information

Consent Form Signed: □ Yes □ No
Date of Collection: Month (JAN through DEC), Year
Time of Sample Collection (nearest hour in 24 hour clock format):
Initials of Data Collector:
(DO NOT COMPLETE; for data entry only) Sample: □ Intact □ Lost □ Broken
Volume (ml):

Donor Information

Date of Birth: Month (JAN through DEC), Year
Sex: □ Male □ Female
Height: Ft, Inches
Weight:

What physical activity do you do on a regular basis?
□ Running □ Swimming □ Biking □ Gymnastics □ Other □ None

Are you a vegetarian? □ Yes □ No

If female:
How many times have you been pregnant?
How many times did you give birth?

In which state do you live?
How long have you lived there? ___ Years

To the best of your knowledge, what is the Ethnic Origin of your

Father Mother
□ □ Caucasian (please mark specific geographic area below if known)
□ □     Northern Europe (Austria, Denmark, Finland, France, Germany, Netherlands, Norway, Sweden, Switzerland, UK)
□ □     Southern Europe (Greece, Italy, Spain, Turkey)
□ □     Eastern Europe (Czechoslovakia, Hungary, Poland, Russia, Yugoslavia)
□ □     Middle Eastern (Israel, Egypt, Iran, Iraq, Jordan, Syria, Other Arab States)
□ □ Other
□ □ Don't know

How many years have you been smoking? ___ Years
Did you quit smoking? □ Yes □ No
If yes, how many years ago?
How many cigarettes do/did you smoke per day?

Do you have lung Emphysema? □ Yes □ No
If yes, for how long? ___ Years

on back

What is the highest grade you completed in school?
□ less than 8th grade
□ 8th, 9th, 10th, or 11th grade
□ high school graduate or equivalency
□ some college, 2yr degree
□ college graduate, 4yr degree
□ post graduate education or degree

Mother Deceased?          Father Deceased?
If Yes, at what age? (asked separately for Mother and Father)
Cause of Death Mother: □ Heart Disease □ Cancer □ Stroke □ Accident □ Suicide □ Other
Cause of Death Father: □ Heart Disease □ Cancer □ Stroke □ Accident □ Suicide □ Other

Health Information

Have you or has anyone in your immediate family (parents, brothers, sisters, or your children) had the following?
Mark all that apply!

Disease                                             You  Mother  Father  Sister  Brother  Child
Heart Disease
Stroke
Hypertension
Blood clots
Diabetes, insulin dependent
Diabetes, not insulin-dependent
Cancer:
    Lung & Bronchus
    Breasts
    Prostate
    Colon & Rectum
    Skin
    Lymphoma & Leukemia
    Other, please specify below:
Alzheimer's Disease
Epilepsy
Schizophrenia
Bipolar disorder (manic depression)
Major depression
Chronic Inflammatory or Autoimmune Disease including Multiple Sclerosis and Rheumatoid Arthritis
Emphysema
Asthma
Other, please specify below:

Do you take prescription drugs on a regular basis? □ Yes □ No
If yes, please specify below:

Have you ever donated blood before? □ Yes □ No
If yes, how many times: Number of Times

Have you been hospitalized in the past 5 years for more than 6 days at a time? □ Yes □ No
If yes, how many times?
For each hospitalization (if not the same), how long did you stay and for what reason?
1) Weeks: ___  □ Acute disorder, including infection and thrombosis □ Chronic Disorder □ Accident □ Other
2) Weeks: ___  □ Acute disorder, including infection and thrombosis □ Chronic Disorder □ Accident □ Other
3) Weeks: ___  □ Acute disorder, including infection and thrombosis □ Chronic Disorder □ Accident □ Other

Do you drink any kind of alcoholic beverage?
□ Never □ Hardly ever □ Less than 3 times per week □ 3 or more times per week □ Daily

Additional health information details you would like to provide:

331



PROBABILITY OF GG EXISTING:
P(GG) = P(G) * P(1-C)

= 90% * (100%-20%)

= 90% * 80%

= 72%



PROBABILITY OF CC EXISTING:
P(CC) = P(C) * P(1-G)

= 20% * (100%-90%)

= 20% * 10%

= 2%



333



PROBABILITY OF GC EXISTING:
P(GC) = P(G) * P(C)

= 90% * 20%

= 18%
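
The three worked calculations above follow one pattern, which the short sketch below reproduces. The function name and the dictionary layout are illustrative; the inputs are the 90% and 20% peak probabilities used in the figure.

```python
def genotype_probabilities(p_g, p_c):
    """Combine per-allele peak probabilities into genotype probabilities."""
    return {
        "GG": p_g * (1 - p_c),   # G present, C absent
        "CC": p_c * (1 - p_g),   # C present, G absent
        "GC": p_g * p_c,         # both alleles present
    }


probs = genotype_probabilities(0.90, 0.20)
# Probability that neither peak is present: (1 - 0.90) * (1 - 0.20).
neither = 1 - sum(probs.values())
```

The three genotype probabilities sum to 92%; the remaining 8% corresponds to the case in which neither allele peak is present.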
SEQUENCE LISTING

<110> SEQUENOM
Braun et al.

<120> METHODS FOR GENERATING DATABASES AND DATABASES FOR IDENTIFYING
POLYMORPHIC GENETIC MARKERS

<130> 21736-2033PC

<140> Not Yet Assigned
<141> 2000-10-13

<150> 60/217,658
<151> 2000-07-10

<150> 60/159,176
<151> 1999-10-13

<150> 60/217,251
<151> 2000-07-10

<150> 09/663,968
<151> 2000-09-19

<160> 118

<170> FastSEQ for Windows Version 4.0

<210> 1
<211> 361
<212> DNA
<213> Homo Sapien

<400> 1 

ctgaggacct ggtcctctga ctgctctttt cacccatcta cagtccccct tgccgtccca 60 

agcaatggat gatttgatgc tgtccccgga cgatattgaa caatggttca ctgaagaccc 120 

aggtccagat gaagctccca gaatgccaga ggctgctccc cgcgtggccc ctgcaccagc 180 

agctcctaca ccggcggccc ctgcaccagc cccctcctgg cccctgtcat cttctgtccc 240 

ttcccagaaa acctaccagg gcagctacgg tttccgtctg ggcttcttgc attctgggac 300 

agccaagtct gtgacttgca cggtcagttg ccctgagggg ctggcttcca tgagacttca 360 

a 361 

<210> 2 
<211> 44 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide Primer 
<400> 2 

cccagtcacg acgttgtaaa acgctgagga cctggtcctc tgac 44 

<210> 3 
<211> 42 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide Primer 
<400> 3 

agcggataac aatttcacac aggttgaagt ctcatggaag cc 42 

<210> 4
<211> 17
<212> DNA
<213> Artificial Sequence
<220>

<223> Probe
<400> 4

gccagaggct gctcccc 17

<210> 5
<211> 17
<212> DNA

<213> Artificial Sequence
<220>

<223> Probe
<400> 5

gccagaggct gctcccc 17

<210> 6
<211> 19
<212> DNA

<213> Artificial Sequence
<220>

<223> Probe
<400> 6

gccagaggct gctccccgc 19

<210> 7 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Probe 



<400> 7 

gccagaggct gctccccc 18 

<210> 8 

<211> 161 

<212> DNA 

<213> Homo Sapien 



<400> 8 

gtccgtcaga acccatgcgg cagcaaggcc tgccgccgcc tcttcggccc agtggacagc 60 

gagcagctga gccgcgactg tgatgcgcta atggcgggct gcatccagga ggcccgtgag 120

cgatggaact tcgactttgt caccgagaca ccactggagg g 161 

<210> 9 
<211> 43 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide Primer 
<400> 9

cccagtcacg acgttgtaaa acggtccgtc agaacccatg cgg 43

<210> 10 
<211> 44 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide Primer 
<400> 10 

agcggataac aatttcacac aggctccagt ggtgtctcgg tgac 44

<210> 11
<211> 15
<212> DNA

<213> Artificial Sequence
<220>

<223> Oligonucleotide Primer
<400> 11

cagcgagcag ctgag 15

<210> 12
<211> 15
<212> DNA

<213> Artificial Sequence
<220>

<223> Probe
<400> 12

cagcgagcag ctgag 15

<210> 13
<211> 16
<212> DNA

<213> Artificial Sequence
<220>

<223> Probe
<400> 13

cagcgagcag ctgagc 16

<210> 14
<211> 17
<212> DNA

<213> Artificial Sequence
<220>

<223> Probe
<400> 14

cagcgagcag ctgagac 17

<210> 15
<211> 205
<212> DNA
<213> Homo Sapien

<400> 15

gcgctccatt catctcttca tcgactctct gttgaatgaa gaaaatccaa gtaaggccta 60

caggtgcagt tccaaggaag cctttgagaa agggctctgc ttgagttgta gaaagaaccg 120

ctgcaacaat ctgggctatg agatcaataa agtcagagcc aaaagaagca gcaaaatgta 180

cctgaagact cgttctcaga tgccc 205

<210> 16
<211> 42
<212> DNA

<213> Artificial Sequence
<220>

<223> Oligonucleotide Primers
<400> 16

cccagtcacg acgttgtaaa acggcgctcc attcatctct tc 42

<210> 17
<211> 42
<212> DNA

<213> Artificial Sequence
<220>

<223> Oligonucleotide Primer
<400> 17

agcggataac aatttcacac agggggcatc tgagaacgag tc 42

<210> 18
<211> 20
<212> DNA

<213> Artificial Sequence
<220>

<223> Oligonucleotide Primer
<400> 18

caatctgggc tatgagatca 20

<210> 19 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Probe 
<400> 19 

caatctgggc tatgagatca 20 

<210> 20 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Probe
<400> 20

caatctgggc tatgagatca a 21

<210> 21
<211> 22
<212> DNA

<213> Artificial Sequence
<220>

<223> Probe
<400> 21

caacctgggc tatgagatca gt 22

<210> 22 

<211> 60 

<212> DNA 

<213> Homo Sapien 

<220> 

<223> Probe 
<400> 22 

gtgccggcta ctcggatggc agcaaggact cctgcaaggg ggacagtgga ggcccacatg 60 

<210> 23 

<211> 60 

<212> DNA 

<213> Homo sapien 

<400> 23 

ccacccacta ccggggcacg tggtacctga cgggcatcgt cagctggggc cagggctgcg 60 

<210> 24 
<211> 42 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 24 

cccagtcacg acgttgtaaa acgatggcag caaggactcc tg 42 

<210> 25 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 25 

cacatgccac ccactacc 18 

<210> 26 
<211> 43 
<212> DNA 

<213> Artificial Sequence
<220>

<223> Oligonucleotide primer 
<400> 26 

agcggataac aatttcacac aggtgacgat gcccgtcagg tac 43 

<210> 27 
<211> 15 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Probe 
<400> 27 

atgccaccca ctacc 15 

<210> 28 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Probe 
<400> 28 

cacatgccac ccactaccg 19 

<210> 29 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Probe 
<400> 29 

cacatgccac ccactaccag 20 

<210> 30 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Probe 
<400> 30 

agcggataac aatttcacac agg 23 

<210> 31 

<211> 2363 

<212> DNA 

<213> Homo Sapien 

<220> 
<221> CDS 

<222> (138)...(2126)
<223> AKAP-10
<300>

<308> GenBank AF037439
<309> 1997-12-21
<400> 31 

gcggcttgtt gataatatgg cggctggagc tgcctgggca tcccgaggag gcggtggggc 60 
ccactcccgg aagaagggtc ccttttcgcg ctagtgcagc ggcccctctg gacccggaag 120 
tccgggccgg ttgctga atg agg gga gcc ggg ccc tcc ccg cgc cag tcc 170
Met Arg Gly Ala Gly Pro Ser Pro Arg Gln Ser
1 5 10

ccc cgc acc ctc cgt ccc gac ccg ggc ccc gcc atg tcc ttc ttc cgg 218
Pro Arg Thr Leu Arg Pro Asp Pro Gly Pro Ala Met Ser Phe Phe Arg
15 20 25

cgg aaa gtg aaa ggc aaa gaa caa gag aag acc tca gat gtg aag tcc 266
Arg Lys Val Lys Gly Lys Glu Gln Glu Lys Thr Ser Asp Val Lys Ser
30 35 40

att aaa gct tca ata tcc gta cat tcc cca caa aaa agc act aaa aat 314
Ile Lys Ala Ser Ile Ser Val His Ser Pro Gln Lys Ser Thr Lys Asn
45 50 55

cat gcc ttg ctg gag gct gca gga cca agt cat gtt gca atc aat gcc 362
His Ala Leu Leu Glu Ala Ala Gly Pro Ser His Val Ala Ile Asn Ala
60 65 70 75

att tct gcc aac atg gac tcc ttt tca agt agc agg aca gcc aca ctt 410
Ile Ser Ala Asn Met Asp Ser Phe Ser Ser Ser Arg Thr Ala Thr Leu
80 85 90

aag aag cag cca agc cac atg gag gct gct cat ttt ggt gac ctg ggc 458
Lys Lys Gln Pro Ser His Met Glu Ala Ala His Phe Gly Asp Leu Gly
95 100 105

aga tct tgt ctg gac tac cag act caa gag acc aaa tca agc ctt tct 506
Arg Ser Cys Leu Asp Tyr Gln Thr Gln Glu Thr Lys Ser Ser Leu Ser
110 115 120

aag acc ctt gaa caa gtc ttg cac gac act att gtc ctc cct tac ttc 554
Lys Thr Leu Glu Gln Val Leu His Asp Thr Ile Val Leu Pro Tyr Phe
125 130 135

att caa ttc atg gaa ctt cgg cga atg gag cat ttg gtg aaa ttt tgg 602
Ile Gln Phe Met Glu Leu Arg Arg Met Glu His Leu Val Lys Phe Trp
140 145 150 155

tta gag gct gaa agt ttt cat tca aca act tgg tcg cga ata aga gca 650
Leu Glu Ala Glu Ser Phe His Ser Thr Thr Trp Ser Arg Ile Arg Ala
160 165 170

cac agt cta aac aca atg aag cag agc tca ctg gct gag cct gtc tct 698
His Ser Leu Asn Thr Met Lys Gln Ser Ser Leu Ala Glu Pro Val Ser
175 180 185

cca tct aaa aag cat gaa act aca gcg tct ttt tta act gat tct ctt 746
Pro Ser Lys Lys His Glu Thr Thr Ala Ser Phe Leu Thr Asp Ser Leu
190 195 200

gat aag aga ttg gag gat tct ggc tca gca cag ttg ttt atg act cat 794
Asp Lys Arg Leu Glu Asp Ser Gly Ser Ala Gln Leu Phe Met Thr His
205 210 215

tca gaa gga att gac ctg aat aat aga act aac agc act cag aat cac 842
Ser Glu Gly Ile Asp Leu Asn Asn Arg Thr Asn Ser Thr Gln Asn His
220 225 230 235

ttg ctg ctt tcc cag gaa tgt gac agt gcc cat tct ctc cgt ctt gaa 890
Leu Leu Leu Ser Gln Glu Cys Asp Ser Ala His Ser Leu Arg Leu Glu
240 245 250

atg gcc aga gca gga act cac caa gtt tcc atg gaa acc caa gaa tct 938
Met Ala Arg Ala Gly Thr His Gln Val Ser Met Glu Thr Gln Glu Ser
255 260 265

tcc tct aca ctt aca gta gcc agt aga aat agt ccc gct tct cca cta 986
Ser Ser Thr Leu Thr Val Ala Ser Arg Asn Ser Pro Ala Ser Pro Leu
270 275 280

aaa gaa ttg tca gga aaa cta atg aaa agt ata gaa caa gat gca gtg 1034
Lys Glu Leu Ser Gly Lys Leu Met Lys Ser Ile Glu Gln Asp Ala Val
285 290 295

aat act ttt acc aaa tat ata tct cca gat gct gct aaa cca ata cca 1082
Asn Thr Phe Thr Lys Tyr Ile Ser Pro Asp Ala Ala Lys Pro Ile Pro
300 305 310 315

att aca gaa gca atg aga aat gac atc ata gca agg att tgt gga gaa 1130
Ile Thr Glu Ala Met Arg Asn Asp Ile Ile Ala Arg Ile Cys Gly Glu
320 325 330

gat gga cag gtg gat ccc aac tgt ttc gtt ttg gca cag tcc ata gtc 1178
Asp Gly Gln Val Asp Pro Asn Cys Phe Val Leu Ala Gln Ser Ile Val
335 340 345

ttt agt gca atg gag caa gag cac ttt agt gag ttt ctg cga agt cac 1226
Phe Ser Ala Met Glu Gln Glu His Phe Ser Glu Phe Leu Arg Ser His
350 355 360

cat ttc tgt aaa tac cag att gaa gtg ctg acc agt gga act gtt tac 1274
His Phe Cys Lys Tyr Gln Ile Glu Val Leu Thr Ser Gly Thr Val Tyr
365 370 375

ctg gct gac att ctc ttc tgt gag tca gcc ctc ttt tat ttc tct gag 1322
Leu Ala Asp Ile Leu Phe Cys Glu Ser Ala Leu Phe Tyr Phe Ser Glu
380 385 390 395

tac atg gaa aaa gag gat gca gtg aat atc tta caa ttc tgg ttg gca 1370
Tyr Met Glu Lys Glu Asp Ala Val Asn Ile Leu Gln Phe Trp Leu Ala
400 405 410

gca gat aac ttc cag tct cag ctt gct gcc aaa aag ggg caa tat gat 1418
Ala Asp Asn Phe Gln Ser Gln Leu Ala Ala Lys Lys Gly Gln Tyr Asp
415 420 425

gga cag gag gca cag aat gat gcc atg att tta tat gac aag tac ttc 1466
Gly Gln Glu Ala Gln Asn Asp Ala Met Ile Leu Tyr Asp Lys Tyr Phe
430 435 440

tcc ctc caa gcc aca cat cct ctt gga ttt gat gat gtt gta cga tta 1514
Ser Leu Gln Ala Thr His Pro Leu Gly Phe Asp Asp Val Val Arg Leu
445 450 455

gaa att gaa tcc aat atc tgc agg gaa ggt ggg cca ctc ccc aac tgt 1562
Glu Ile Glu Ser Asn Ile Cys Arg Glu Gly Gly Pro Leu Pro Asn Cys
460 465 470 475

ttc aca act cca tta cgt cag gcc tgg aca acc atg gag aag gtc ttt 1610
Phe Thr Thr Pro Leu Arg Gln Ala Trp Thr Thr Met Glu Lys Val Phe
480 485 490

ttg cct ggc ttt ctg tcc agc aat ctt tat tat aaa tat ttg aat gat 1658
Leu Pro Gly Phe Leu Ser Ser Asn Leu Tyr Tyr Lys Tyr Leu Asn Asp
495 500 505

ctc atc cat tcg gtt cga gga gat gaa ttt ctg ggc ggg aac gtg tcg 1706
Leu Ile His Ser Val Arg Gly Asp Glu Phe Leu Gly Gly Asn Val Ser
510 515 520

ccg act gct cct ggc tct gtt ggc cct cct gat gag tct cac cca ggg 1754
Pro Thr Ala Pro Gly Ser Val Gly Pro Pro Asp Glu Ser His Pro Gly
525 530 535

agt tct gac agc tct gcg tct cag tcc agt gtg aaa aaa gcc agt att 1802
Ser Ser Asp Ser Ser Ala Ser Gln Ser Ser Val Lys Lys Ala Ser Ile
540 545 550 555

aaa ata ctg aaa aat ttt gat gaa gcg ata att gtg gat gcg gca agt 1850
Lys Ile Leu Lys Asn Phe Asp Glu Ala Ile Ile Val Asp Ala Ala Ser
560 565 570

ctg gat cca gaa tct tta tat caa cgg aca tat gcc ggg aag atg aca 1898
Leu Asp Pro Glu Ser Leu Tyr Gln Arg Thr Tyr Ala Gly Lys Met Thr
575 580 585

ttt gga aga gtg agt gac ttg ggg caa ttc atc cgg gaa tct gag cct 1946
Phe Gly Arg Val Ser Asp Leu Gly Gln Phe Ile Arg Glu Ser Glu Pro
590 595 600

gaa cct gat gta agg aaa tca aaa gga tcc atg ttc tca caa gct atg 1994
Glu Pro Asp Val Arg Lys Ser Lys Gly Ser Met Phe Ser Gln Ala Met
605 610 615

aag aaa tgg gtg caa gga aat act gat gag gcc cag gaa gag cta gct 2042
Lys Lys Trp Val Gln Gly Asn Thr Asp Glu Ala Gln Glu Glu Leu Ala
620 625 630 635

tgg aag att gct aaa atg ata gtc agt gac att atg cag cag gct cag 2090
Trp Lys Ile Ala Lys Met Ile Val Ser Asp Ile Met Gln Gln Ala Gln
640 645 650

tat gat caa ccg tta gag aaa tct aca aag tta tga ctcaaaactt 2136
Tyr Asp Gln Pro Leu Glu Lys Ser Thr Lys Leu *
655 660

gagataaagg aaatctgctt gtgaaaaata agagaacttt tttcccttgg ttggattctt 2196

caacacagcc aatgaaaaca gcactatatt tctgatctgt cactgttgtt tccagggaga 2256

gaatggggag acaatcctag gacttccacc ctaatgcagt tacctgtagg gcataattgg 2316

atggcacatg atgtttcaca cagtgaggag tctttaaagg ttaccaa 2363
<210> 32

<211> 662

<212> PRT

<213> Homo Sapien

<400> 32

Met Arg Gly Ala Gly Pro Ser Pro Arg Gln Ser Pro Arg Thr Leu Arg
1 5 10 15

Pro Asp Pro Gly Pro Ala Met Ser Phe Phe Arg Arg Lys Val Lys Gly
20 25 30

Lys Glu Gln Glu Lys Thr Ser Asp Val Lys Ser Ile Lys Ala Ser Ile
35 40 45

Ser Val His Ser Pro Gln Lys Ser Thr Lys Asn His Ala Leu Leu Glu
50 55 60

Ala Ala Gly Pro Ser His Val Ala Ile Asn Ala Ile Ser Ala Asn Met
65 70 75 80

Asp Ser Phe Ser Ser Ser Arg Thr Ala Thr Leu Lys Lys Gln Pro Ser
85 90 95

His Met Glu Ala Ala His Phe Gly Asp Leu Gly Arg Ser Cys Leu Asp
100 105 110

Tyr Gln Thr Gln Glu Thr Lys Ser Ser Leu Ser Lys Thr Leu Glu Gln
115 120 125

Val Leu His Asp Thr Ile Val Leu Pro Tyr Phe Ile Gln Phe Met Glu
130 135 140

Leu Arg Arg Met Glu His Leu Val Lys Phe Trp Leu Glu Ala Glu Ser
145 150 155 160

Phe His Ser Thr Thr Trp Ser Arg Ile Arg Ala His Ser Leu Asn Thr
165 170 175

Met Lys Gln Ser Ser Leu Ala Glu Pro Val Ser Pro Ser Lys Lys His
180 185 190

Glu Thr Thr Ala Ser Phe Leu Thr Asp Ser Leu Asp Lys Arg Leu Glu
195 200 205

Asp Ser Gly Ser Ala Gln Leu Phe Met Thr His Ser Glu Gly Ile Asp
210 215 220

Leu Asn Asn Arg Thr Asn Ser Thr Gln Asn His Leu Leu Leu Ser Gln
225 230 235 240

Glu Cys Asp Ser Ala His Ser Leu Arg Leu Glu Met Ala Arg Ala Gly
245 250 255

Thr His Gln Val Ser Met Glu Thr Gln Glu Ser Ser Ser Thr Leu Thr
260 265 270

Val Ala Ser Arg Asn Ser Pro Ala Ser Pro Leu Lys Glu Leu Ser Gly
275 280 285

Lys Leu Met Lys Ser Ile Glu Gln Asp Ala Val Asn Thr Phe Thr Lys
290 295 300

Tyr Ile Ser Pro Asp Ala Ala Lys Pro Ile Pro Ile Thr Glu Ala Met
305 310 315 320

Arg Asn Asp Ile Ile Ala Arg Ile Cys Gly Glu Asp Gly Gln Val Asp
325 330 335

Pro Asn Cys Phe Val Leu Ala Gln Ser Ile Val Phe Ser Ala Met Glu
340 345 350

Gln Glu His Phe Ser Glu Phe Leu Arg Ser His His Phe Cys Lys Tyr
355 360 365

Gln Ile Glu Val Leu Thr Ser Gly Thr Val Tyr Leu Ala Asp Ile Leu
370 375 380

Phe Cys Glu Ser Ala Leu Phe Tyr Phe Ser Glu Tyr Met Glu Lys Glu
385 390 395 400

Asp Ala Val Asn Ile Leu Gln Phe Trp Leu Ala Ala Asp Asn Phe Gln
405 410 415

Ser Gln Leu Ala Ala Lys Lys Gly Gln Tyr Asp Gly Gln Glu Ala Gln
420 425 430

Asn Asp Ala Met Ile Leu Tyr Asp Lys Tyr Phe Ser Leu Gln Ala Thr
435 440 445

His Pro Leu Gly Phe Asp Asp Val Val Arg Leu Glu Ile Glu Ser Asn
450 455 460

Ile Cys Arg Glu Gly Gly Pro Leu Pro Asn Cys Phe Thr Thr Pro Leu
465 470 475 480

Arg Gln Ala Trp Thr Thr Met Glu Lys Val Phe Leu Pro Gly Phe Leu
485 490 495

Ser Ser Asn Leu Tyr Tyr Lys Tyr Leu Asn Asp Leu Ile His Ser Val
500 505 510

Arg Gly Asp Glu Phe Leu Gly Gly Asn Val Ser Pro Thr Ala Pro Gly
515 520 525

Ser Val Gly Pro Pro Asp Glu Ser His Pro Gly Ser Ser Asp Ser Ser
530 535 540

Ala Ser Gln Ser Ser Val Lys Lys Ala Ser Ile Lys Ile Leu Lys Asn
545 550 555 560

Phe Asp Glu Ala Ile Ile Val Asp Ala Ala Ser Leu Asp Pro Glu Ser
565 570 575

Leu Tyr Gln Arg Thr Tyr Ala Gly Lys Met Thr Phe Gly Arg Val Ser
580 585 590

Asp Leu Gly Gln Phe Ile Arg Glu Ser Glu Pro Glu Pro Asp Val Arg
595 600 605

Lys Ser Lys Gly Ser Met Phe Ser Gln Ala Met Lys Lys Trp Val Gln
610 615 620

Gly Asn Thr Asp Glu Ala Gln Glu Glu Leu Ala Trp Lys Ile Ala Lys
625 630 635 640

Met Ile Val Ser Asp Ile Met Gln Gln Ala Gln Tyr Asp Gln Pro Leu
645 650 655

Glu Lys Ser Thr Lys Leu
660

<210> 33 

<211> 2363 

<212> DNA 

<213> Homo Sapien 

<220>
<221> CDS

<222> (138)...(2126)
<223> AKAP-10-5

<221> allele
<222> 2073

<223> Single Nucleotide Polymorphism: A to G
<400> 33

gcggcttgtt gataatatgg cggctggagc tgcctgggca tcccgaggag gcggtggggc 60 
ccactcccgg aagaagggtc ccttttcgcg ctagtgcagc ggcccctctg gacccggaag 120
tccgggccgg ttgctga atg agg gga gcc ggg ccc tcc ccg cgc cag tcc 170
Met Arg Gly Ala Gly Pro Ser Pro Arg Gln Ser
1 5 10

ccc cgc acc ctc cgt ccc gac ccg ggc ccc gcc atg tcc ttc ttc cgg 218
Pro Arg Thr Leu Arg Pro Asp Pro Gly Pro Ala Met Ser Phe Phe Arg
15 20 25

cgg aaa gtg aaa ggc aaa gaa caa gag aag acc tca gat gtg aag tcc 266
Arg Lys Val Lys Gly Lys Glu Gln Glu Lys Thr Ser Asp Val Lys Ser
30 35 40

att aaa gct tca ata tcc gta cat tcc cca caa aaa agc act aaa aat 314
Ile Lys Ala Ser Ile Ser Val His Ser Pro Gln Lys Ser Thr Lys Asn
45 50 55

cat gcc ttg ctg gag gct gca gga cca agt cat gtt gca atc aat gcc 362
His Ala Leu Leu Glu Ala Ala Gly Pro Ser His Val Ala Ile Asn Ala
60 65 70 75

att tct gcc aac atg gac tcc ttt tca agt agc agg aca gcc aca ctt 410
Ile Ser Ala Asn Met Asp Ser Phe Ser Ser Ser Arg Thr Ala Thr Leu
80 85 90

aag aag cag cca agc cac atg gag gct gct cat ttt ggt gac ctg ggc 458
Lys Lys Gln Pro Ser His Met Glu Ala Ala His Phe Gly Asp Leu Gly
95 100 105

aga tct tgt ctg gac tac cag act caa gag acc aaa tca agc ctt tct 506
Arg Ser Cys Leu Asp Tyr Gln Thr Gln Glu Thr Lys Ser Ser Leu Ser
110 115 120

aag acc ctt gaa caa gtc ttg cac gac act att gtc ctc cct tac ttc 554
Lys Thr Leu Glu Gln Val Leu His Asp Thr Ile Val Leu Pro Tyr Phe
125 130 135

att caa ttc atg gaa ctt cgg cga atg gag cat ttg gtg aaa ttt tgg 602
Ile Gln Phe Met Glu Leu Arg Arg Met Glu His Leu Val Lys Phe Trp
140 145 150 155

tta gag gct gaa agt ttt cat tca aca act tgg tcg cga ata aga gca 650
Leu Glu Ala Glu Ser Phe His Ser Thr Thr Trp Ser Arg Ile Arg Ala
160 165 170

cac agt cta aac aca atg aag cag agc tca ctg gct gag cct gtc tct 698
His Ser Leu Asn Thr Met Lys Gln Ser Ser Leu Ala Glu Pro Val Ser
175 180 185

cca tct aaa aag cat gaa act aca gcg tct ttt tta act gat tct ctt 746
Pro Ser Lys Lys His Glu Thr Thr Ala Ser Phe Leu Thr Asp Ser Leu
190 195 200

gat aag aga ttg gag gat tct ggc tca gca cag ttg ttt atg act cat 794
Asp Lys Arg Leu Glu Asp Ser Gly Ser Ala Gln Leu Phe Met Thr His
205 210 215

tca gaa gga att gac ctg aat aat aga act aac agc act cag aat cac 842
Ser Glu Gly Ile Asp Leu Asn Asn Arg Thr Asn Ser Thr Gln Asn His
220 225 230 235

ttg ctg ctt tcc cag gaa tgt gac agt gcc cat tct ctc cgt ctt gaa 890
Leu Leu Leu Ser Gln Glu Cys Asp Ser Ala His Ser Leu Arg Leu Glu
240 245 250

atg gcc aga gca gga act cac caa gtt tcc atg gaa acc caa gaa tct 938
Met Ala Arg Ala Gly Thr His Gln Val Ser Met Glu Thr Gln Glu Ser
255 260 265

tcc tct aca ctt aca gta gcc agt aga aat agt ccc gct tct cca cta 986
Ser Ser Thr Leu Thr Val Ala Ser Arg Asn Ser Pro Ala Ser Pro Leu
270 275 280

aaa gaa ttg tca gga aaa cta atg aaa agt ata gaa caa gat gca gtg 1034
Lys Glu Leu Ser Gly Lys Leu Met Lys Ser Ile Glu Gln Asp Ala Val
285 290 295

aat act ttt acc aaa tat ata tct cca gat gct gct aaa cca ata cca 1082
Asn Thr Phe Thr Lys Tyr Ile Ser Pro Asp Ala Ala Lys Pro Ile Pro
300 305 310 315

att aca gaa gca atg aga aat gac atc ata gca agg att tgt gga gaa 1130
Ile Thr Glu Ala Met Arg Asn Asp Ile Ile Ala Arg Ile Cys Gly Glu
320 325 330

gat gga cag gtg gat ccc aac tgt ttc gtt ttg gca cag tcc ata gtc 1178
Asp Gly Gln Val Asp Pro Asn Cys Phe Val Leu Ala Gln Ser Ile Val
335 340 345

ttt agt gca atg gag caa gag cac ttt agt gag ttt ctg cga agt cac 1226
Phe Ser Ala Met Glu Gln Glu His Phe Ser Glu Phe Leu Arg Ser His
350 355 360

cat ttc tgt aaa tac cag att gaa gtg ctg acc agt gga act gtt tac 1274
His Phe Cys Lys Tyr Gln Ile Glu Val Leu Thr Ser Gly Thr Val Tyr
365 370 375

ctg gct gac att ctc ttc tgt gag tca gcc ctc ttt tat ttc tct gag 1322
Leu Ala Asp Ile Leu Phe Cys Glu Ser Ala Leu Phe Tyr Phe Ser Glu
380 385 390 395

tac atg gaa aaa gag gat gca gtg aat atc tta caa ttc tgg ttg gca 1370
Tyr Met Glu Lys Glu Asp Ala Val Asn Ile Leu Gln Phe Trp Leu Ala
400 405 410

gca gat aac ttc cag tct cag ctt gct gcc aaa aag ggg caa tat gat 1418
Ala Asp Asn Phe Gln Ser Gln Leu Ala Ala Lys Lys Gly Gln Tyr Asp
415 420 425

gga cag gag gca cag aat gat gcc atg att tta tat gac aag tac ttc 1466
Gly Gln Glu Ala Gln Asn Asp Ala Met Ile Leu Tyr Asp Lys Tyr Phe
430 435 440

tcc ctc caa gcc aca cat cct ctt gga ttt gat gat gtt gta cga tta 1514
Ser Leu Gln Ala Thr His Pro Leu Gly Phe Asp Asp Val Val Arg Leu
445 450 455

gaa att gaa tcc aat atc tgc agg gaa ggt ggg cca ctc ccc aac tgt 1562
Glu Ile Glu Ser Asn Ile Cys Arg Glu Gly Gly Pro Leu Pro Asn Cys
460 465 470 475

ttc aca act cca tta cgt cag gcc tgg aca acc atg gag aag gtc ttt 1610
Phe Thr Thr Pro Leu Arg Gln Ala Trp Thr Thr Met Glu Lys Val Phe
480 485 490

ttg cct ggc ttt ctg tcc agc aat ctt tat tat aaa tat ttg aat gat 1658
Leu Pro Gly Phe Leu Ser Ser Asn Leu Tyr Tyr Lys Tyr Leu Asn Asp
495 500 505

ctc atc cat tcg gtt cga gga gat gaa ttt ctg ggc ggg aac gtg tcg 1706
Leu Ile His Ser Val Arg Gly Asp Glu Phe Leu Gly Gly Asn Val Ser
510 515 520

ccg act gct cct ggc tct gtt ggc cct cct gat gag tct cac cca ggg 1754
Pro Thr Ala Pro Gly Ser Val Gly Pro Pro Asp Glu Ser His Pro Gly
525 530 535

agt tct gac agc tct gcg tct cag tcc agt gtg aaa aaa gcc agt att 1802
Ser Ser Asp Ser Ser Ala Ser Gln Ser Ser Val Lys Lys Ala Ser Ile
540 545 550 555

aaa ata ctg aaa aat ttt gat gaa gcg ata att gtg gat gcg gca agt 1850
Lys Ile Leu Lys Asn Phe Asp Glu Ala Ile Ile Val Asp Ala Ala Ser
560 565 570

ctg gat cca gaa tct tta tat caa cgg aca tat gcc ggg aag atg aca 1898
Leu Asp Pro Glu Ser Leu Tyr Gln Arg Thr Tyr Ala Gly Lys Met Thr
575 580 585

ttt gga aga gtg agt gac ttg ggg caa ttc atc cgg gaa tct gag cct 1946
Phe Gly Arg Val Ser Asp Leu Gly Gln Phe Ile Arg Glu Ser Glu Pro
590 595 600

gaa cct gat gta agg aaa tca aaa gga tcc atg ttc tca caa gct atg 1994
Glu Pro Asp Val Arg Lys Ser Lys Gly Ser Met Phe Ser Gln Ala Met
605 610 615

aag aaa tgg gtg caa gga aat act gat gag gcc cag gaa gag cta gct 2042
Lys Lys Trp Val Gln Gly Asn Thr Asp Glu Ala Gln Glu Glu Leu Ala
620 625 630 635

tgg aag att gct aaa atg ata gtc agt gac gtt atg cag cag gct cag 2090
Trp Lys Ile Ala Lys Met Ile Val Ser Asp Val Met Gln Gln Ala Gln
640 645 650

tat gat caa ccg tta gag aaa tct aca aag tta tga ctcaaaactt 2136
Tyr Asp Gln Pro Leu Glu Lys Ser Thr Lys Leu *
655 660

gagataaagg aaatctgctt gtgaaaaata agagaacttt tttcccttgg ttggattctt 2196

caacacagcc aatgaaaaca gcactatatt tctgatctgt cactgttgtt tccagggaga 2256

gaatggggag acaatcctag gacttccacc ctaatgcagt tacctgtagg gcataattgg 2316

atggcacatg atgtttcaca cagtgaggag tctttaaagg ttaccaa 2363

<210> 34 

<211> 662 

<212> PRT 

<213> Homo Sapien 

<400> 34 

Met Arg Gly Ala Gly Pro Ser Pro Arg Gln Ser Pro Arg Thr Leu Arg
1 5 10 15

Pro Asp Pro Gly Pro Ala Met Ser Phe Phe Arg Arg Lys Val Lys Gly
20 25 30

Lys Glu Gln Glu Lys Thr Ser Asp Val Lys Ser Ile Lys Ala Ser Ile
35 40 45

Ser Val His Ser Pro Gln Lys Ser Thr Lys Asn His Ala Leu Leu Glu
50 55 60

Ala Ala Gly Pro Ser His Val Ala Ile Asn Ala Ile Ser Ala Asn Met
65 70 75 80

Asp Ser Phe Ser Ser Ser Arg Thr Ala Thr Leu Lys Lys Gln Pro Ser
85 90 95
His Met Glu Ala Ala His Phe Gly Asp Leu Gly Arg Ser Cys Leu Asp
100 105 110

Tyr Gln Thr Gln Glu Thr Lys Ser Ser Leu Ser Lys Thr Leu Glu Gln
115 120 125

Val Leu His Asp Thr Ile Val Leu Pro Tyr Phe Ile Gln Phe Met Glu
130 135 140

Leu Arg Arg Met Glu His Leu Val Lys Phe Trp Leu Glu Ala Glu Ser
145 150 155 160

Phe His Ser Thr Thr Trp Ser Arg Ile Arg Ala His Ser Leu Asn Thr
165 170 175

Met Lys Gln Ser Ser Leu Ala Glu Pro Val Ser Pro Ser Lys Lys His
180 185 190

Glu Thr Thr Ala Ser Phe Leu Thr Asp Ser Leu Asp Lys Arg Leu Glu
195 200 205

Asp Ser Gly Ser Ala Gln Leu Phe Met Thr His Ser Glu Gly Ile Asp
210 215 220

Leu Asn Asn Arg Thr Asn Ser Thr Gln Asn His Leu Leu Leu Ser Gln
225 230 235 240

Glu Cys Asp Ser Ala His Ser Leu Arg Leu Glu Met Ala Arg Ala Gly
245 250 255

Thr His Gln Val Ser Met Glu Thr Gln Glu Ser Ser Ser Thr Leu Thr
260 265 270

Val Ala Ser Arg Asn Ser Pro Ala Ser Pro Leu Lys Glu Leu Ser Gly
275 280 285

Lys Leu Met Lys Ser Ile Glu Gln Asp Ala Val Asn Thr Phe Thr Lys
290 295 300

Tyr Ile Ser Pro Asp Ala Ala Lys Pro Ile Pro Ile Thr Glu Ala Met
305 310 315 320

Arg Asn Asp Ile Ile Ala Arg Ile Cys Gly Glu Asp Gly Gln Val Asp
325 330 335

Pro Asn Cys Phe Val Leu Ala Gln Ser Ile Val Phe Ser Ala Met Glu
340 345 350

Gln Glu His Phe Ser Glu Phe Leu Arg Ser His His Phe Cys Lys Tyr
355 360 365

Gln Ile Glu Val Leu Thr Ser Gly Thr Val Tyr Leu Ala Asp Ile Leu
370 375 380

Phe Cys Glu Ser Ala Leu Phe Tyr Phe Ser Glu Tyr Met Glu Lys Glu
385 390 395 400

Asp Ala Val Asn Ile Leu Gln Phe Trp Leu Ala Ala Asp Asn Phe Gln
405 410 415

Ser Gln Leu Ala Ala Lys Lys Gly Gln Tyr Asp Gly Gln Glu Ala Gln
420 425 430

Asn Asp Ala Met Ile Leu Tyr Asp Lys Tyr Phe Ser Leu Gln Ala Thr
435 440 445

His Pro Leu Gly Phe Asp Asp Val Val Arg Leu Glu Ile Glu Ser Asn
450 455 460

Ile Cys Arg Glu Gly Gly Pro Leu Pro Asn Cys Phe Thr Thr Pro Leu
465 470 475 480

Arg Gln Ala Trp Thr Thr Met Glu Lys Val Phe Leu Pro Gly Phe Leu
485 490 495

Ser Ser Asn Leu Tyr Tyr Lys Tyr Leu Asn Asp Leu Ile His Ser Val
500 505 510

Arg Gly Asp Glu Phe Leu Gly Gly Asn Val Ser Pro Thr Ala Pro Gly
515 520 525

Ser Val Gly Pro Pro Asp Glu Ser His Pro Gly Ser Ser Asp Ser Ser
530 535 540

Ala Ser Gln Ser Ser Val Lys Lys Ala Ser Ile Lys Ile Leu Lys Asn
545 550 555 560

Phe Asp Glu Ala Ile Ile Val Asp Ala Ala Ser Leu Asp Pro Glu Ser
565 570 575

Leu Tyr Gln Arg Thr Tyr Ala Gly Lys Met Thr Phe Gly Arg Val Ser
580 585 590

Asp Leu Gly Gln Phe Ile Arg Glu Ser Glu Pro Glu Pro Asp Val Arg
595 600 605

Lys Ser Lys Gly Ser Met Phe Ser Gln Ala Met Lys Lys Trp Val Gln
610 615 620

Gly Asn Thr Asp Glu Ala Gln Glu Glu Leu Ala Trp Lys Ile Ala Lys
625 630 635 640

Met Ile Val Ser Asp Val Met Gln Gln Ala Gln Tyr Asp Gln Pro Leu
645 650 655

Glu Lys Ser Thr Lys Leu
660

<210> 35 

<211> 162025

<212> DNA

<213> Homo Sapien

<300>

<308> GenBank AC005730
<309> 191*6-10-22 

<400> 35 

gaattcctat ttcaaaagaa acaaatgggc caagtatggt ggctcatacc tgtaatccca 60 

gcactttggg aggccgaggt gagtgggtca cttgaggtca ggagttccag gccagtctgg 120 

ccaacatggt gaaacactgt ctctactaaa aatacaaaaa ttagccgggc gtggtggcgg 180 

gcacctgtaa tcccagctac tcaggaggct gaggcaggag aattgcttga acctgggaga 240

tggaggttgc agtgagccga gatcgcgcca ctgctctcca gcctgggtgg cagagtgaga 300 

ctctgtctca aaaagaaaca aagaaataaa tgaaacaatt ttgttcacat atatttcaca 360 

aatttgaaat gutaaaggta ttatggtcac tgatatcctg tttcattctt tatataatca 420 

ttaagtttga aatgtatact tgcactacta acacagtagt taatcttagt cctacaagtt 480 

actgctttta cacaatatat tttcgtaata tgtatgcact ggtgtttatg tacgtgttta 540 

tgtttatatc tgttaaaatt agcagtttcc atctttttct attttgtacc atcacatcag 600 

ttcagaagga ttgacagagc aaaatgattt gatgaagtat aaaagtcaca tggtgagtgg 660 

cataaataca actctgaaca attaggaggc tcactattga ctggaactaa actgcaagcc 720 

agaaagacac atatcctata tgtcaagaga tgtaccaccc aggcagttaa agaagggaag 780 

tacacataga aagcacaatg gtgaataatt aaaaaattgg aatttatcag acactggatt 840

catttgctcc taaagtcaga gtcctctatt gtttttttgt ttttgtgggt ttctttttaa 900 

atttttttat tttttgtaga gtcggagtct cactgtgtta cccgggctgg tctagaactc 960 

ctggcctcaa acaaacctcc tgcctcagct tcccaaagca ttgggattac agacatgagc 1020 

cactgagccc agcccagacg ctttagcatt tatgaagctt ctgaaatagt tgtagaaacc 1080 

gcataagctt tccatgtcac tttcaaagtt tgatggtctc tttagtaaac caaccaagtt 1140 

attcctcaag ggcaaaataa catttctcag tgcaaaactg atgcacttca ttaccaaaag 1200 

gaaaagacca caactataga ggcgtcattg aaagctgcac tcttcagagg ccaaaaaaaa 1260 

aggtacaaac acatactaat ggaacattct ttagaagagc cccaaagtta atgataaaca 1320 

ttttcatcaa agagaaaaga gaacaaggtg ttagcaaatt cctctatcaa ataacactaa 1380 

acatcaagga acatcaatgg catgccatgt ggaagaggaa gtgctagctc atgtacaaac 1440

cagtagataa tttcaacttg ctgccgaatg aaacctcttt gcaaggtatg aatcagcact 1500 

tctcatgttt gttttgcttt gttttgtttt gtttttagag acaggccctt gctctgtcac 1560 

acaggctgga gtgcagtggc acgatcagag ctcactgcaa cctgaaactc ctgggctcaa 1620 

gggatcctcc tgccttagcc tcccaagtag ctgggactac aggcccacca tgcccagcta 1680 

attttttaaa ttttctatag agatgggatc tcactagcac ctttcatgtt tgatgttcat 1740 

atacaacgac caaggtacaa tgtggaaaag ggtctcaggg atctaaagtg aaggaggacc 1800 

agaaagaaaa ggggttgcta catagagtag aagaagttgc acttcatgcc agtctacaac 1860 

actgctgttt tcctcagagc agagttgatg atctaaatca ggggtcccca acccccagtt 1920 

catagcctgt taggaaccgg gccacacagc aggaggtgag caataggcaa gcgagcatta 1980 

ccacctgggc ttcacctccc gtcagatcag tgatgtcatt agattctcat aggaccatga 2040 

accctattgt gaactgagca tgcaagggat gtaggttttc cgctctttat gagactctaa 2100 

tgccggaaga tctgtcactg tcttccatca ccctgagatg ggaacatcta gttgcaggaa 2160 

aacaacctca gggctcccat tgattctata ttacagtgag ttgtatcatt atttcattct 2220

atattacaat gtaataataa tagaaataaa ggcacaatag gccaggcgtg gtggctcaca 2280 
cctgtaatcc cagcacttcg ggaggccaag gcaggcggat cacgaggtca ggagatcgag 2340

accatcctgg ctaaaacggt gaaaccccgt ctactaaaaa ttcaaaaaaa aattagccgg 2400

gtgtggtggt gggcacctgt agtcccagct actcgagagg ctgaggcagg agaatggtgt 2460

gaacctggga ggcagagctt gaggtaagcc gagatcacgc cactgcactc cagcctgggc 2520

gacagagcga tactctgtct caaaaaaaaa aaaaaaaaaa aaagaaataa agtgaacaat 2580

aaatgtaatg tggctgaatc attccaaaac aatcccccca ccccagttca cggaaaaatt 2640 

ctcccacaaa accagtccct ggtgccaaaa aggttgggga ccgctaatct aaataatcta 2700 

atcttcattc aatgctaaaa aatgaataaa ctttttttta aatacacggt ctcactttgt 2760 

tgcccaggct ggagtacggt ggcatgatca cagctcactg tagcctcaat cacccaggcc 2820

ccagcgatcc tcccacctaa acttcctgag tagctgggac tacaggcacg caccaccatg 2880

cccagctaat ttttaaattt tttatagaga tgggggtctc accatgttgc ccagactggt 2940

ctcaaaccct gggctcaagt gatcctccct caaactcctg gactcaagtg atcctccttc 3000 

cttggcctcc caaagtgctg ggattacaag catgagccac tgtacccagc tggataaaca 3060 

ttttaagtcg cactacagtc atggacaatc aggcttttca acatgcagta tggacagtga 3120 

gtcccagggt ctgcttttcc atactgaaat acatgtgata ctaaggagaa aggtgctcgc 3180 

aaggatattt aaaatgaaga atatttaaaa tgaggaaaaa actgtttctt catgactttg 3240 

ataaggctga taaagaccat ttctgtgatc tcaggtgatt cactcaagta gtatatttca 3300

gtaatcacta tctggaacag cctgaatctt aaccaaaata ccatgatttt ttaatgctgt 3360 

tatga:acct tgatgatatg accaaactgc aatgtaggca gctaaatctc cacgagtttg 3420 

acttccccga gagttgacag ttttcttcac aaattaaaga aatatatttt ttgatacatg 3480 

attggcarat ttaaaaacta cactgaaatg ctgcaaaatg atataaagaa acattttcca 3540

gaatcaaatg caatcaaaga gtggattagg aatctactca ccattatcaa ctaaatagaa 3600

acacttggac tgggtgtggt ggctcacatc tgtaatctca gcactttggg aggccaaggc 3660 

aggtggartg cctgaggcca ggagctcaag accagcctga gcaacatagc aaaactctgt 3720 

ctctacaaaa aaaaaaaaaa attaaccagg catggtggca gatgcttgta atcccagcta 3780 

ctctggaagc tgaagtagga ggactgcttg agcccaggag atcaagactg cagtgagccg 3840

tggtcatgct gcgccacagc ctgagtgaca gagagagacc ctgtctcaaa aacaaaaaca 3900

aacaaaaaac acttaacctt cctgtttttt gctgttgttg ttgttgtttg tttgttttga 3960

gatggagtcr cactctgttg cccaggctgg agtgcagtgg cgtgatcttg gctcactgca 4020 

agctctgcct cccgggttca cgccattctc ctgcctcagc ctcccgagta gctgggacta 4080

taggcgcccg ccaccacgcc cggctacttt tttgcatttt tagtagagat ggggtttcac 4140 

cgtgttagcc aggatggtct tgatctcctg acctcgtgat ccacctgcct cggcctccca 4200 

aagtgctggg attacaggca tgagccaccg cacccggcca acctttctgt tttttagttt 4260 

gatatgcttg ttaactcagc agctgaaaga atgctgaaag tggccttcag taaaaaaatt 4320 

tcactagaat ctctacatcc atatttaatc tgaatgcata tccagattga tcagttagag 4380 

caaaaacact catcatcatt cctgatgacc tctaattctg gtttcggctt tctatttcaa 4440 

tggaaacaga ataaggaaag aaatggaagg gctctggaaa tttgtcctgg gctatagata 4500 

ctatcaaaga tcaccaacaa taagatctct cctataaata taaaacaagt ataattaatt 4560 

ttttaattat ttttttctct tcagaggatt ttatttcaag ataaaacata acttctaccc 4620 

atactattga ttccaaaggt tagaaaaagt gtttttcctc atcttatcct tcaaagaggt 4680

cacagcaatg caaacatcta taaaatgcct ctgcataatt gtcagaagct atagtccaga 4740 

aatcattgaa aatgcttttc cattttaagc ttaggtgagg tgtcttagga aacctctatg 4800

acaacttact ctatttattg ggaggtaaac tcccagactc tcccagggtc tcctgtattg 4860

atctcatttt ttaggcttcc taatcccttg aagcacaatc gaaaaagccc tggatctctt 4920 

ttctgcacat atcatcgcgg aattcattcg gcttccagca agctgacact ccatgataca 4980

agcggcctcg ccctztctccg gacgccagtc cttgctgcgg ttagctagga tgaggggttt 5040 

gctgggcttc agtgcaggct tctgcgggtt cccaagccgc accaggtggc ctcacaggct 5100 

ggatgtcacc attgcacact gagctcctgg caggctgtac caatttttta attatttaat 5160 

atttattttt aaaattatgg tgaatatttt ggtattctgc tctaaaatag gcccataaat 5220 

gcacagcaga tatctcttgg aacccacagc tttccactgg aagaactaag tatttttctt 5280 

ttaaagatgc tactaagtct ctgaaaagtc cagatcctct acctctttcc atcccaaact 5340

aagacttgga atttatgaga gatctagcta acagaaatcc cagacacatc attggttctt 5400

cccagagtgc agtcctccta aagaggctca gccctaagca ggcccctgca ccaggagggt 5460 

gggtctgaga cccacatagc acttcccaag gtgcatgctc cagagaggca ctgaaacagc 5520 

tgagcacaag cctgcaagcc tggagaactc tcacagtcag aacggagggg gcccagtggg 5580 

actaacataa agagaaaagg gaacacagag aaatggatgg caccaacaac cagcaaagcc 5640 

ttcatggcca atgaaagcat cagtgacggg gccagaaccc tcatccccaa agactcttca 5700 

ctgcctttag tgaaaaacaa tggctagaga gtgaagttat gatcatgtat agagaggtaa 5760 

agttacattt ttatattctg actctgctaa tgtgaaattc cctatctgct agactaaaag 5820 

tttcagacac cctgttcaaa tatcccatta gttgctagag acttaaaatg aacagaacgc 5880

acattgtcag gatgactatt accaaaaaat caaaagacag caagtattgg tgaggatgta 5940

gagaaactgg aacttttgtg cactgtttat gagaatgtaa aatggagcag ctgctgtgga 6000

aaagagtatg caggttcctc aaagagtaaa accaagatgt ggaaacaact aaatgcccat 6060 

cagtggatga aggggtagac aatatgtggt atatacatac catggagtac tattcagcct 6120 

ctaaaaaaaa aaaaggaaat tctataacat gcaacagcat ggatgaatct tgaggacatt 6180

ttgctaatga aataaggcag tcatagaaag acaaatactg cacgactcca cttatatgag 6240

ataccaaaaa tagacaaatt catagaatca aagagtacaa tggaggttac ctggagctgc 6300

agggcgggaa acgaggagtt actaatcaac gaacataacg ttgcagttaa gtaagatgaa 6360

taagctctca agatcagctg tacaacactg tacctagagt caacaataat gtattgtaca 6420 

cttaaaaatt tgttaagggt agattaacaa atgtagtaga tccacaaatg tggttaagtg 6480 

ttcttaccac agtaaaataa aaaaagaata tcaagcccag gagttcgaga ctagcctggg 6540

taacatggtg aaaccctgtc tctacagaaa atacaaaaat tagccagctg tggaggtgca 6600

ctcctaggga ggctgaggtg ggaggcttgc ttgagcccag gaggtcaagg ctgcagtgag 6660 

ccatgattgc accactgtac tccagcccag atgacagagc aagacaccac cccccccaaa 6720 

aaaagaaaaa gaatatcaaa cattttaaaa gatcagatac gcaagaacaa caacaaaaaa 6780 

gagatgaaca gagcatcgac cctcatctag tgggattctt ggtctaactg aaaaacagac 6840

attgagagac aaacaatgac agtgatgtga tcacagcaat tacacaggta tcccctgggg 6900 

actgcagaag aaaggaggaa tgcctaactt tcagaaaata gagaaagcgt caaacagttg 6960 

gtgaaagcct tccaaaacta gagagaactg cacacaccaa atcacagaaa gaagaaaagc 7020 

cgtgggagat tctgggaccc accggctatt tttgatggct gaacaccctg ctgcaggaga 7080 

gacaggagct ggaaagcatg gtgggatgaa acctcaaaca gctttgcctg cattgcttaa 7140

gatgactggg cttgattaac tctagtcaat ggggacaatt caatcaaaga agaaagatgc 7200 

tcaaattcac attttagaat gattttttat ggcagtatgg ggaatagatt aaaagagagt 7260 

gaagctggag gcaagaaact tgttaagagg caactgaaac agtctagatg ataaataata 7320

aactgacaga gtgactagaa aaatcagaac aggctgaatc aacagatacc tagatgaaaa 7380

taacaggact tgatcaccag ttgtatcttg gagaggaagg agttgtttcc ttgctttccc 7440

tacgactggg aatacggaag gtttgccgtg tgtattggtt atatactggt gtgtagccaa 7500 

tcactgacaa ccatttagca gcttaaaaca caaaggctta tctcccagtt tctgtgggcc 7560 

aggaatctaa gataggctta gctggctggt tctggctcag agtttctcaa gaggttgcaa 7620 

tcaagatgtc agctggggtt gcatcatctg aaggctcaac tggggccgga gggtccactt 7680 

ccaaggagtt cactcacctg cctgacaagg cagtgctggt tgttggcagg agatctcaat 7740

tcattgccaa gtgagcctct ctatagcatt gctggaacat cctccccatc tggcagttgg 7800 

cttctctcag catgagtgat ctgagagaga gagcaaggag gaagccacag tgttcttcct 7860 

actcctactc ctaacactat ggacctactc ctaacactct cacttctgcc ttattccatt 7920 

agttagaaag ggaactaagc tccacctctt gaaataagaa gtgtcaaaga atttgtggat 7980 

atatttaaaa atcatcacac tgtggaagtg gatagggggt tcaattaatg ctgaacttga 8040 

aatgcctgag acattcaaat gtccaacagg caatgaacat acccatagat ggtcatgact 8100 

ttagcaagaa tagaggaaga tcacagaatt aaggaggaat tgaaaggtaa aagaagtgga 8160 

gtcagattcc ccctgaaaag tgagccatga aaggaacttt aactattgag ttagaggtca 8220 

gagtaggaaa tttcggtgga attctttttt aaagaaagga accatataag catgttttga 8280 

ggtagaggga gaataaatca gtagacaggg agaggtaaaa aacataaatg ataggggata 8340

gttgacaaag gtcttggcag aatcccttac ccattgactt ggggccaaga gagggacact 8400

tctttgtttg agggataagg aaaataagaa agaatgggtg ctatttagtg tggtcctgtc 8460 

tctagggcaa acgcataggt aacaaactgt gtgtgttagg aatatagatg tgacctcaca 8520 

ttgagattct cacctcaaat ccattttgtt gttacctgta ccttcctacc ttctcttttt 8580 

gctacatgca gactgctgtt ttgtcttcct ggcctgttcc aggtttcagc attctggcat 8640 

atctgctacc ctgttcccaa acctctctag agtccatgct ccttccttgg atagtgtttg 8700 

attgggccac gtatctaaga agtgatgcct tcagttaggc ctgagaacct cctctatgga 8760 

aatctccatc agtgaccctg acagacttgg tatcttggag atgtcactgc tcccagcctg 8820 

tggtctagga gaatctcagc ctgggcctct agtagtatgg ataaggcgtt aaggtatctt 8880 

tgaaccagag tctgtcatat tcctcaatgt gggacagata aaacagtggt agtgctggtg 8940 

tttctgagct agaactctgg tttttggtct agattctttg atgtatgacc tttcagaggt 9000 

attaaaattt gttctaatac aatgttcaat acaaatgtag ttccttttct gttaggacct 9060 

caacaaaaca tgaccaactg tagatgaaca ttaaactatg acaattcatg gaaatgaata 9120 

cagtaatacc tgcggttccc ccattttagc agtcactatg gtgacatttg gcacaaatgg 9180 

ctatttaagg gtgcttttgt taaaacctac catcttacta ggcacatgat attgaaacta 9240 

atgaaataat ggagaaactt cttaaaaact tttaatgaat aaagtgatga agtgataata 9300 

ttttagctgc tatttataaa gtgactatta caggtcaaac attcttctag ggtttttttg 9360 

ttgaagttgt cacatttaat ccttaataac ccactatgag tcaggtattc ttctctcccc 9420 

tttggacagt tggggaaatg ggggtcagag aggttaggta atttgctcag ggccacacaa 9480 
cctgcatgta gaaaatctga gatttgtaca ggaacgtatc aaactctgaa gtccatgctt 9540

ctattttccc atgctgcctt tctaataaaa ggtaactaat gctactggat gctgccccca 9600 

aagtgagtca ctttcacccc accctacttg attttctcca taaaactaat cacatcctga 9660 

caacttattt attgctgatc tcccccacta gattataaac tcaataaaag caagatcctt 9720 

gtctgctgaa tatcagtacc taaaacgctg tctagcacag agcaagtaat taatatttgt 9780 

tgaatgaaca aataaaggaa aaaaattcaa aggaagaaaa agccctaaaa cagatgttta 9840

cctaaacata cattttaaaa gaaagcatat aacaaattca ggacagaatt taaatttgat 9900 

tttttaaaga aataaccaag tgctagctgg gcacagtggc tcacacctgt aatcctagca 9960 

ctctgggagg ccgaggcagg cagatcactt gaggtcaaga gttcaagacc agcctggcca 10020 

acatggtgaa acctgtctct actaaaaata cagaaattat ccaggcatgg tggcaggtcc 10080 

ctgtaacccc agctactcag gaggctgagt caggagaatt gcttgaaccc aggaggcaga 10140 

ggttgcagtg ggccaagatt gcaccactgc actccagcct gagtaacaaa gcaagactct 10200

gtctgaagga gaaggaaaga aagaaggaaa gaaggaaaga aggaaagaag gaaagaagga 10260 

aagaaagaaa gaaagaaaga aagaaagaaa gaaagaaaga aagaaagaaa gaaagaaaga 10320

aagaaagaaa aagaaagaaa gaaagaaaga accaagtgct tatttgggac ctactatgct 10380

atgtttttcc atgcacgcta ttttcagtaa agcagttagc aaacttgcaa gatcataaca 10440 

acaaatatat gcttctataa ctctaaaatt gtgctttaag aagttcctct ttaccagctc 10500 

atgtatgcat tagttttcta agagttacta gtaacttttt ccctggagaa tatccacagc 10560 

cagtttattt aaccaaagga ggatgcttac taacatgaag ttatcaaatg tgagcctaag 10620

ttgggccagt tcatgttaat atactccaga acaaaaacca tcctactgtc ctctgacaat 10680 

tttacctgaa aattcatttt ccacattacc aaggagccag ggtaggagaa tatagaaaga 10740

ccacccaaga atccttactt ctttcagcaa aatcaattca aagtaggtaa ctaaacacat 10800 

gccctaacaa tgaatagcag attgtgctca gaagaatgat ctacaacatc ttactgtgaa 10860 

ggaactactg aaatattcca ataagacttc tctccaaaat gattttattg aatttgcatt 10920 

ttaaaaaata ttttaagcct aaattttaaa aggtttgata ttggtacatg aatagacaaa 10980 

cagacatgga ctagaccaag aattaggttc aaacatatac aggaatttaa tatacgataa 11040

atctagtatt ccaaaggaac caacaaatgg tgttcagaca gcaggatagg catcaggaaa 11100 

aacacagttg ggcaccctac cttactccta acaccaggag taactgaagg agcaccaaat 11160 

atttatttat tttaattata gttttaagtt ctagggtacg tgtgcacaac atgcaggttt 11220

attacatagg tatacatgtg ccatgttggt gaggagcacc aaatatttaa aagaaaaaaa 11280

ttggccaggg gcggtggctc acacctgtaa tcccagcact ttgggaggcc aaggtgggca 11340

gatcacctga ggtcgggagt tcgagaccag cctgagcaac atggagaaac cccatctcta 11400

ctaaaaatac aaaattagcc aggcatggtg gcacatgcct gtaatcccag ctacttggga 11460

ggctgaggca ggagaatagc tttaatctgg gaggcacagg ttgcggtgag ctgagatatt 11520

gcactccagc ctgggcaaca agagcaaaac ttcaactcaa aaaaattaat aaataaataa 11580 

aaataaagaa agaaaagaaa aaaatgaaaa tagtataatt agcagaagaa aacaccgtag 11640

aatcctcgga ctcttaggat ggggaatgcc tataatataa aaaccctgaa gttataaaag 11700 

agaaaatcac ctacatacaa accaaatctt tctacatgcc taaaacatag cacaaacaca 11760 

gctaaataat catagctgaa tgaactggga aaacaaaact tgactcatat ccagacagag 11820 

ttaattttcc tacacataaa gagtacctat ataaacccaa caaaaaaacc accactaacc 11880 

caaaataaaa atgtgacagg taatgaacag gtagttcaca gagaatacaa atggctcttc 11940 

ggcacataag atgctcagac tgacttttac ttatttattt tttgagagac agggtctcac 12000 

gatgttgccc aggttaggct caaactcctg ggctcaaatg atagtaccag gactacaggt 12060 

gtgccccacc gcacctggct cctcaaccac ctgtattaac aggaaatgca aaataaaact 12120 

ttcaaatcta ttttacctat tagaatggca aaaatttgaa aaacttcaaa catcatcatg 12180 

ttggtgagaa tgtgaggaga ctggcactct cattttttgc tgatagcata tatatactga 12240 

tggcttctat ggaaagcaat ctggcagcgt ctatcaaatg tacaagtgca tatatccttt 12300 

gacaaagcaa ttccactcta ggaatgtgtt ctatatggtt gtgcttcctg gggctgggaa 12360 

ctgggagcta agggacaggg gcagaagata atcttctttt ccctccttcc ccgttaaaca 12420 

tgttgaattt tatatactgt aatatattat ttttcacaaa agataatttt taagcgatat 12480 

gtctgggaat tttttttttt cttttctgag acagggtctc actctgtcat ccaggctgga 12540 

atgccatggt atgatctcag ctgactgcag cctcgacctc ctgggttcaa gcaatcctcc 12600 

cacctcagcc tcctgagtag ctgggactac aggcacgtgc catcatgcta atttttgtat 12660 

atacagggtc tcactatgtt gcccaggcta atgtcaaact cctaggctca agcaatccac 12720 

ccacctcagg ctccaaagtg ctgggattac aggcgtgagc caccgcgcct ggccctggga 12780 

attcttacaa aagaaaaaat atctactctc cccttctatt aaagtcaaaa cagagaagga 12840 

aattcaacct ataatgaaag tagagaaggg cctcaaccct gagcaacaaa cacaaaggct 12900 

atttctgaga caggaatttg ctgaacaaaa tcgagggaag atgacaagaa tcaagactca 12960 

cttctcggct gggcgcagtg gctcacacct gtaatcccag cactttggga ggccgaggcg 13020 

gacagatcac gaggtcagga gattgagacc atactggcta acacagtgaa acccagtctc 13080 
tactaaaaat acaaaaaatt agccgggcgt ggtggcaggt gcctgtagtc ccagctactt 13140
gggaagctga ggcaggagaa tggcgtgaac ccaggaagcg gagcttgcag tgagccgaga 13200
tcacgccact gcactccagc ctgggtgaca gagcaagact ctgtctcaaa aaaaaaaaaa 13260 
aagactcatt tctctagatc ttgagccgta ttcaaattta tctcagctta gtgagaggtt 13320 
aaagcaagga atatccttcc ctgtgggccc tgctccttac tgaaggaagg taacggatga 13380
gtcaaggaca ccaatggaga aaagcactaa caccattatc tgatgaacat tacgtgaaga 13440
agggtaagaa gtgaagtgga attgctgaag aagtcagtga aagcggacat tcatttgggg 13500
aaatggaata taggaaatcc ataaaagtga ttaaaaagat gttagaggct gaggcggggg 13560
gaccacaggg tcaggagatc gagaccatcc tggctaacac ggtgaaaccc catctctact 13620 
aaaaatacaa aaaattagcc aggcgtggtg gcaggcacct gtagtcccaa ctactcggga 13680 
gactgaggca ggagaatggc atgaacctgg gagacggagc ttgcagtgag ccgagatcac 13740 
gccactgcac tccagcctgg gtgacagagt gagactccat ctcaaaaaaa aaagttagat 13800 
acgagagata aagatccaac agacacacaa ctgctaattc tgaacagaac aaaacaaatg 13860 
gcacaggaaa agaaaattta agatataaca ccggaaaact ttcctgaaat tgagtaactg 13920 
aatctatagc ttgaaagggt ttagcatatg ccaagaaaaa tcagtagagt ccaaccagca 13980 

caagacacat ctagcaaggc tggtgattct accaacacag agaaagaagt gggtgaccca 14040

taatgcggaa aaaggcagac catctgcagt cttctccaga acactggagt ctgaagacaa 14100 

aagaatgctg cctactgagc cagaagggag agaaagtgac ccaacacatc tttaccaagt 14160 

tagaatgtca cgcattattt aaaggctgca aaagccatga aagacatgaa agaacacaag 14220

catttacaac atgaaagaac acaagcattc tcatactcaa gaatccttaa gaaaaatgta 14280 

gtcctaatcc agcccactga aagttaaatg tacttaatgt gctcattaat gggaacttca 14340 

tagcttcaaa tcagtctggt cccatctacc aacatctctc gcccggcttt cctgcaatag 14400 

tcagcacctt tccctcctcc cagtcttgtc ccctggagtc tgctctcagc atagcagagt 14460 

gaccacatca acacccaagt cagagccctc cagtgcgcac tggtctacaa agcccttccc 14520 

accccccacc ccacgtgccc tccggatcct tgtgacgtgt ctcctgcata ccctagcagc 14580 

cctggcctcc tcactgcccc tcctgtacat caggaaggcg actccttgag tcttggctct 14640 

ggccgcctcc tccacctgca gtgagttaac tcccttacct actctaggtc attgctcaaa 14700 

tgtcagcatc tcaatggggc cctccctgac taccctattt aaattctaca tactcccctt 14760 

gaccccatgg acctcactca ccctattcca cttttattct tacaatttag cacttgttct 14820 

cttctaacgt attctaagac ttactcattt attacattgt ttgccacccc ctctagtaca 14880 

taaactccag aggggcaggg atttctgtct atttattcat ttctttatcc ctaggacata 14940 

gaacagggca tagttcagag tattcaatgt tatcaatgaa tgaactagca gtagtaccag 15000 

ttccagtrag gcacagaatt aaatctaaat agaattaaat ctcatggtct gggttaacta 15060 

tggatagaaa attagatata attttaagaa gcctagaaag aaaaaattaa taatgtaaaa 15120

ataatattaa tttgataata ataacaaaaa ctctgccagg cactgtggct caaatctgca 15180 

atcccagcta ctcaggaggc tgaggtggaa ggatcacttg agaccagagt tcaagactca 15240

gcctaggcaa cacggcaaga aactgtctct aaaaaaatta aaacttaaat ttttaaaaaa 15300

gaattctcaa agcgtcacaa aaactggaga ttaaggtaca ggaagtgtga agtaatatta 15360

ctatgctaat ggtttttttt ttttttagaa aggtataacc aaaagatttc tttctcaagt 15420

cgataaactg agaaagataa gcatatcttc caattaacag agggggagga aaagccagat 15480 

acaacaaaat aagatataaa ttagtttcca gttgaaaaca agagtaggag ttattttgca 15540

tcacctcacc tgtgacctcc cccagcccaa aaaacactac tgataaacag ggtagaaaag 15600 

catcatctca gataaagcag gaaaaactgc cacagtctca aaccacaaac tataagcaca 15660 

cacctggcca accctgccaa gtctgggctc agtaggagga acgtgctgag agctaggatg 15720 

taccaactta gacattctgt gggatacaga tgtccctgga agggtcacac catctcaaag 15780 

gcacctgtaa tgcccactga ttacagccac catatgtgag agagaaactc agggcactta 15840 

gagagtataa caagaacctt atgtcatctg agatgaggaa tcctcagccc tgcaaattaa 15900 

ccaactcttt agaacaactg gcaaaacata aatatccaca acttttgttt cagtaattcc 15960 

actcttagat atcaatccaa agtacatgag acagcagata cacacacaaa atggtattta 16020 

ctgcagcatt gtttataata gcaaaaaaca agaaataatc catatgtctc aataggatac 16080 

tgggtacatg agggtatgta cccatcattc aaccatcaaa aagagtgata tggatgtcca 16140

cagatggaca taaaaagctg tgtgttacgt gaaaacaaac tcaagcagca gcaggatggg 16200 

cttatgatag tcagtatgag ctaatttctg gaaaaaaaaa tctagtgtgt gcacagaaaa 16260 

catctgaaag aacagaaaca aaactatcag cagaatattg agatgtttta ctaagttgta 16320 

tatctatact gcttgtaatt tttaccccaa gcaagaatta ctttttggaa aaagaaaatt 16380 

caggaaataa agcatttctt taaacttcat gtttaaacaa atggtgatgg aataaaagag 16440

ttcttattca tcataaacac acacagcaca catgcacgca tgtgcgtgag cacacccttt 16500 

acttgataaa taccatgttg aatattttag tctttccttt taggttctat cccttcactc 16560 

aaaatgcggt tataaataaa tgtacttttc atgtgccttc tgcctaaacc cactttaata 16620 

taactttaca gtcccattat cattatagtc tcaaagctag actcagcctg aaactaccct 16680 
ttcatttgga acccttatta aaatgccaca tacagctcct tcaaataaaa acaaacccta 16740

gqacctgaca ctaggcttcc tttgttgcta ctcataatgg ccaagttctg tgcttataat 16800 

acaicrtctt tcattttatt gctacatatc caagggtttt atatgttttt cttattatat 16860 

cttaattcaa aacaccatca cgctcttttc cagatgaaaa taaggaaaag aaattgagca 16920

actuactgac ttaaaggtca taaaactata tagtagcaga gtcagcaaaa gaagaaacac 16980 

acatctccca agtagaggct gaaaaccagt accattcacc tccagggtga gctatataca 17040 

gat:acaaag tcaccttctc taaatgttca aactgaatcc catacccata ctttaccact 17100 

acctcgtaag aacagcctca gatcttgtta tagccttttt tttagcatgc tgaagccaat 17160 

aaaatgcttc ccattcagca agagaaacaa gttctgaaac actgaataat ctgcccaggg 17220 

cc:atgaaca tttccactgt gagaaatgtt ctccactgtg tggagaagat ccttactctt 17280

ctrcaracag gcagaacatt agaaaaattc ttggattcta tgatgcacag cttaggagtc 17340 

tg:ttjqcac aatttaagtc caaatagtta ttaaatcctc ctctgttcca gaaacagtgc 17400 

:a;tatuctgt gaatataaaa attgaaaaga tactctcctg gctcccaaga aagtcagcca 17460 

gat -in/i-^qag acacaggcac acaaatcact gtcacatgaa gctctacctc cctaacttca 17520 

aanai^cc taagtcacca agaatacagt agcagttgtg actacgagta actactataa 17580 

tt ca.it arit tatcttccct tagaaaactc ttctcccttg gaaatttatt tgcatttcta 17640 

aaia;:a::c cttactaaaa ggaagcaggg ctccttgggg aaatagctga ttctaggtgt 17700 

gcacta'.qan atgaaaatgg tgagtctggg acatcccatg ttgcccagaa atcaaggaac 17760 

tg:c:a.Kiia ttaacagagt catgttaaat ggacctaaga gtgaaccaga aggagctcac 17820 

ttt3::::cqc gtggaacaat ttcaagaaaa acatgacagt aatgaattat aaaacatgaa 17880 

ttaa.i.it uca tattggtact aaaaagagaa caaaaggatg tggctttgga taaagctctt 17940

cttcat.j.iaa gaataccagc taataaatgt aaaggaaatg agagaattag aaaaattatc 18000 

attttotaaa ccttaatata ttcacctaga catgctaaaa ccactgagta aaaggctgct 18060 

tag:; t n.iaq atgctcacat gatctcagag tttcacacca cagataattt attagataca 18120 

ggadii.i.na tgtgatcaag cttcctgtga cccccagcca ggccccacaa cactatgtgc 18180 

ctccttotga tgtgggagct acacagcatc gcccacacag cttctcgcca aaactgtttg 18240 

aagct-t.it era caagggaaga actggacagc ttctgaccat gagacgctcc accagacaac 18300 

ttgcttg^cc tctccaaaga aacttgcttg gcctctccaa agaaaactca gtttcattta 18360 

aaaacaaaac taattattta aaaacaaacg aaaagcaagt tgtggacttg agctccaggg 18420 

acagaqcaga catacttttc cctgttcttc ccagtaagtg gtaataaaaa ccctcaacac 18480

tagatataaa acaaatataa gaaggttctg gaaggggaag aggaggcaga ctatccaggt 18540 

gccttyc*:jgc ccacagaaca acccagtgat gggttcactg ggtcttcttt ttgcttcatt 18600 

atctcagact tggagctgaa gcagcaggca acttcaaaac accaaggggc acagattgaa 18660 

aagccccaag aaaagcctgc cctctctagc caaaggacca ggaaggagac agtctaatga 18720

gatggaacac atttagacag taactgccca tttaccagca ataactgagc agggagccta 18780 

gacttccagt cttgtgagga cgtaccaagg tacccaacac ccccaccaag gctgagtaag 18840 

gactgcgact tttatccctg catggcagta gtaaggagcc catccctcac ccgccagcag 18900 

tgtcagggga acctggactt ccactcccac ccaggagtga tgaggccctc cctgctgggg 18960

tcatgtcaga ggaggcctag tggagattca gtgacttaac cttttcccag agataatgag 19020 

gccacctttc ctccctcttc ccccatggtg acagtgaaag cactgtggca agcagtaggc 19080

actcctaccc ctcctagcca gggaggtatc agggaggcca agtagggaac cagaataccc 19140

acaaccaccc agcagcaaca ggggtccccc accccattgg gtgtcaatgg aagcagagcg 19200 

gaaagcctgg atatttaccc ccatctagaa gtaacaagct gatgtccccc ttcttctact 19260 

acaatggtgt tcaaaacagg tttaaataag gtctagagtc tgataacgta atacccaaat 19320 

cgttgaagtt ttcattgagg atcatttata ccaagagtca ggaagatccc aaactgaaag 19380 

agagaaaaga caattgacag acactagcac taagagagca cagatattag aactacctga 19440 

aaggatgtta aagcacatat cataagcctc aacaggctgg gcgcggtggc tcacgcctgt 19500 

aaccccagca ctttgggagg ccgaggcagg tggatcacaa gatcaggaga tcgagaccat 19560 

cctggctaac acggtgaaac cccgtctcta ctaaaaatac aaaaaaaaat agcaaggcat 19620 

ggtggtgggc acctgtagtc ccagctactc gggagcctga ggcaggagaa tggcatgaac 19680 

ctgggaagag gagcagtgag ccgagatcgc accaccgcac ttccagcctgg gcaacagagc 19740 

aagacttcgt cccaaaaaaa aaaaaaaaaa aaaaaaaagc ctcaacaaac aactacaaac 19800 

gtgcttgaaa caaatgaaaa aaaaatcttg gcaaagaaat aaaagatata tattttggcc 19860 

aggtgcagtg gctcacagcc tgtaatccct gcactttggg aggctgaggc aggcggatca 19920 

cctgaggtca ggagtttgag accagcctga ccaacatgga gaaaccccgt ctctactaaa 19980 

aatacaaaat tagccagtca tggtggcaca tgcctgtaat cctagctact caggaggccg 20040

aggcaggaga atcgcttgaa ctcaggaggt ggaggttgcg gtgagccgag atcccgccat 20100 

tgcacattgc actccagcct gggcaacaag agcaaaactc catctcaaaa aaatagatac 20160 

atattttaat ggaaatttta gaattgaaaa atacagtaac caaattgaat ggaaagacaa 20220 

catagaatgg agggggcaga caaaataatc agtgaacttc aacagaaaat aatagaaatt 20280 
acccaatatg aagaacagaa agaaaataga ctggccaaaa aataaagaag aaaaaagagg 20340

agcagcagga ggaatgatgg aaaaagagaa aggaaggaag gaagggaagg agggagggaa 20400

ggagtgaggg agaaagtctc aaagacctct gagactaaaa taaaagatct aacacttgtc 20460

atcagggtcc aggaaagaga caaagatggc acagctggaa acgtattcaa aaaataatag 20520

ctgaaaactt cccaaatttg gcaagagaca taaacctata gattcgaaat gctgaacccc 20580

aaataaaaag cccaataaaa tccacaccaa aatacatcat agtcaaactt ctgaaaagac 20640

gaaaagagaa aacgtcttga aagcagtgag tgaaacaaca cttcatgtat aagggaaaaa 20700

caattcaagt aacagatttc ttacagaaat taaggaagcc agaaggaaat gacacaatgg 20760

ttttcaagtg ctgaaagaaa agaagtgtca acacaaaatt ctagattcag taaaaatatc 20820

cttcaagaat caatgggaaa tcaagacagt ctcagataaa gcaaaataag agaatatgtt 20880

gccagcagat ctcccctaaa ggaatggcaa aaggaagatc atgcaacaga ccaaaaaatg 20940

atgaaagaag gaatccagaa acatcaagaa gaaagaaata acatagtaag caaaaataca 21000

tgtaattaca ataaaatttc tatctcctct taagacttct aaattatatt gatggttgaa 21060

gcaaaaatta taaccctgtc tgaagtgctt ctactaaatg tatgcagaga attataaatg 21120

gggaaagtat aggtttctat acctcattga agtggtaaaa tgacaacact gtgaaaagtt 21180 

acatacacac acacacgtaa gtatatataa atatatgtgt gtatatgtgt gtgtatatat 21240

atatatacat ataatgtaat acagcaacca ctaacaacac tatacaaaga gataataacc 21300 

aaaaacaatt tagataaatt gaaatggaat tctaaaaaat attcaaatac tctacaggaa 21360 

gacaagacaa aaagagaaaa aaagaggagg acaaactaaa ttttttaaaa acataaataa 21420

aatggtagac ttaagcccta acttatcaat aattacataa atgtaaatga tctaattata 21480

tcaattaaaa gacagagata gcagagttaa tttaaaaaca tagctataag aaacctgctt 21540

tgggctgagt gcagtgactc acacttgtaa tcccagcact tcgggaggcc aaggcgggtg 21600 

gatcacctga ggtcaggagt tccagaccag cctggacaac atggtaatac cccatctcta 21660 

ctaaaaatac aaaaaaatta gccaggcatg gtggcacacg cctgtagtcc caactactca 21720 

ggaggctgcg acacaagaac tgcttgaacc cgggcagcag aggtagcagt gggccaagat 21780 

tgcgccactc cagcctgaac gacagagtga gactccacct cagttgaaaa acaaaaaaga 21840

aacctgcttt aaatatacca acatatgttg gttgaaatta aaagaataaa atatatcatg 21900 

aaaacattaa tcaaaagaaa ggagtggcta tattaataac ataaaataga cttcagagaa 21960 

aagaaaattt caagagacag gaataaaagg atcaagaaaa gatcctgaaa gaaaagcagg 22020

caaatcaatc attctgcttg gagattcaac accctctctt aacaactgat agaacaacta 22080 

gacaaaaaaa tcagcatgga gttgagaaga acttaacacc actgaacaac aggatctaat 22140 

agacatttac ggaacactct acccaacaat agcaaaataa acattctttt caagtattca 22200 

ctgaacatat ccttagaccc taccctgggc cataaaacaa agctcactag tgattgccga 22260 

aggcttggat ggacagtgga agagctgcat ggggagggag aaggtgacag ttaaagagtg 22320 

taggatttct ttttgggata atgaaaatgt tccaaaattg attgtggtga tgttggcgca 22380 

actctacaaa tataaaaaag gccattgaat tgtacgtttt aagtgggtga aacatatggt 22440

atgtggatta tatctaacgc tttttaaaaa cttaacacat ttcaaagaat agaagtcata 22500 

cagagtgtgc tctactggaa tcaaactaga aagaggtaac tggaggataa cgagaaaagc 22560 

ctccaaatac ttgaaaactg gacagcacat ttctaaaatc atccgtgggt caaagatatt 22620 

catttctgat attcattttt attgtttaat gtatttttaa aaatttctta agggaaataa 22680 

actgactaaa aatgaatatg gctgggtgcg gtggctcacg cctgtgatcc cagcactttg 22740 

ggaggccgag gctggtggat cacaagatca ggagttcgag accagcctgg ccaagatggt 22800 

gaaaccccgt ctcaactaaa aaactacaaa aagtagccaa gcgcagtggc gggagcctgt 22860 

ggtcccagct acttgggagg ctgaggtagg agaatcgctt gaacacaggc agcagaggtt 22920 

gcagtgagcc aagattgtgc cactgcacgc cagcctgggc gacagagact gcctcaaaaa 22980 

aaaaaaaaaa aaaaagaata tcaaaatttg tgggacatag ttaaagcaat gctgagaggg 23040 

aaatttataa cactaaatgt ttacattaga aaagagaaaa agtttcaaat caatagtctc 23100 

cactcccatc tcaagaacac agaagatgaa gagcaaaata aacccaaagc aagcaaaaga 23160 

aagaaaatat aaaaataaat cagtaaaatt gaaaacagaa acacaataaa gaaaatcagt 23220 

gaaacaaagt actgattctt cgaaagatta ataaaattga caaacctcta gcaaggctaa 23280 

caaacaaaaa agaaagaaga cacggattac cagttattag aatgaaagca taattagaaa 23340 

caactctaca cattataaat ttgacaatgt agatgaaatg gactaattac tgaaaaaaca 23400 

caaattacca caactcaccc aatatgaaat agataattgg gatagcctga taactactga 23460 

gaaaattgaa tttgtaattt taacactctt aaaacagaaa cattaaactt aatattttat 23520 

aaatattaga taaggtaatt atacccttcc ttaacaaata aaaacgacaa attattttgc 23580 

agctaaagag atgtatgtac tgtgaaaaat atcttcagaa aaatagaact ttgtttgaag 23640 

aataaggatt taaaaaatgt ttttaactct caagaagcaa atatctgggc ccagatggtt 23700 

tcactgaaga attctaccaa atgtttaatg aagaattacc accaactcta catagcatct 23760 

ttgagaaaac tgaagagaag ggaacatctc ccagttcatt ttatgaagtg ggtgttactc 23820 

tgatactaga actgtataag gacagctact cttgacacac tgcctatggg tagctctgct 23880 
ctgcaggaac agtcagaaaa aaaaaaaaaa gaagcactgg acaagggcag tataaaaaaa 23940

gaaaactggg ccaggtgcag tggctcacac ctgtaatctc agcactttgg gaggctgacg 24000

ctggtggatc acctgaggtc aggagtttga gactagcctg gccaacatgg taaaaccctg 24060

tctctactaa aatacaaaaa ttagccaggc agggtggtgg ggaaaataaa aaggaaaaaa 24120

aaacaaaaat aaactgcaga ccaatatcct tcatgagtat agacacaaaa ctccttaaac 24180

tccttaacaa aatattagca agtagaagca atatataaaa ataattatac accatgatca 24240

agtgggactt attccagaaa cgcaagtctg gttcaacatt tgaaaacaag gtaacccact 24300

atatgaacgt actaaagagg aaaactacat aatcacatca atcaatgcag aaaaaagcat 24360

ttgccaaaat ccaatatcca ttcatgatac tctaataaga aaaataagaa taaaggggaa 24420

attccttgac ttgataaagc ttacaaaaga ctacaaaagc ttacagctaa cctatactta 24480 

atggtgaaaa actaaatgct ttcccctacg atcaggaaca aagcaaggat gttcactctc 24540

attgctctta tttaacatag ccctgaagtt ctaacttgtg caaaacgata agaaagggaa 24600

atgaaagacc tgcagattgg caaagaagaa ataaaactgt tcctgtttgc agatgacatg 24660

attgtctcat agaaaatgta aagcaactag gggtaggggg gcagtggaga cacgctggtc 24720 

aaaggatacc aaatttcagt taggaggagt aagttcaaga tacctattgc acaacatggt 24780

aactatactt aatatattgt attcttgaaa atactaaaag agtgggtgtt aagcgttctc 24840

accacaaaaa tgataactat gtgaagtaat gcatacgtta attagcacaa cgtatattac 24900

tccaaaacat catgttgtac atgataaata cacacaattt tatctgtcag tttaaaaaca 24960 

catgattttg gccaggcaca gtggctcata cctgtaatcc cagcatttta ggaggctgag 25020 

gcgagcagaa aacttgaggt cgggagtttg agaccagaat ggtcaacata gtgaaatccc 25080

gtctccacta ataatacaaa aattagcagg atgtggtggc gtgcacctgt agacccagct 25140 

acttgggagg ctgaggcacg agaattgctt gaacaaggga ggcagaggtt gcagtgagct 25200 

gggtgccact gcattccagc ctggtgacag agtgagactc catctcaaaa aaaataaaat 25260 

aaagcatgac ttttcttaaa tgcaaagcag ccaagcgcag tggctcatgc ctgtaatccc 25320

accactttgg gaggccgagg caggcagatc acaaggtcag gagtttgaga ccagcctgac 25380 

caacatggtg aaaccccatc tctactaaaa aatatataaa ttagccaggc atgtgtagtc 25440

tcagctactc aggaggctga ggcaggagaa tcacttgaac ccggaggcag aggttgcagt 25500 

gttgagccac cgcactccag cctgggtgag agaacgagac tccgtctcaa aaaaaaaaag 25560 

caaaataacc taattttaaa aacactaaaa ctactaagtg aattcagtaa gtctttagga 25620

ttcaggatat atgatgaaca tacaaaaatc aattgagctg gacaaaggag gattgtttta 25680 

ggtcagtagt ttgaggctgt aatgcacaat gattgtgcct gtgaatagct gctgtgctcc 25740

agcctgagca gcataatgag accacatctc tatttaaaaa aaaaaaaatt gtatctctat 25800

gtactagcaa taagcacatg ggtactaaaa ttaaaaacat aataaatact gtttttaatt 25860

gcctgaaaaa aatgaaatac ttacatataa atctaacaaa atgtgcagga cttgtgtgct 25920

gaaaactaca aaacgctgat aaaagaaatc aaagaagact taaatagcgt gaaatatacc 25980 

atgcttatag gttggaaaac ttaatatagt aaagatgcca attttatcca aattattaca 26040 

caggataaca ttattactac caaaatccca gaaaaatttt acatagatat agacaagatc 26100

atacaaaaat gtatacggaa atatgcaaag gaactagagt agctaaaaca aatttgaaaa 26160 

agaaaaataa agtgggaaga atcagtctat ccagtttcaa gacttacata gctacagtaa 26220

tcaagactgt gatattgaca gagggacagc tatagatcaa tgcaaccaaa tagagaacta 26280 

agaaagaagc acacacaaat atgcccaaat gatttctgac aaaggtgtta aaacacttca 26340

acgggggaag atatgtctct cattaaaggg tgtagagtca ttgcacatct ataggcaaaa 26400

agatgaacct gaacctcaca ccctacagaa aaattaactc aaaatgactc aaggactaaa 26460

cataagatat acatctataa aacatttaga aaaaggccac gcacggtggc tcacgctcgt 26520 

aatcccagca ctttgggagg ccaaggcagg tggatcacct aaggtcagga gtttgagacc 26580 

agccggatca acatggagaa gccccatctc tactaaaaat acaaaattag ctggacgtgg 26640 

tggcacatgc ctgtaatccc agctacttgg gaggctgagg catgagaatc gcttgaaccc 26700 

ggggggcaga ggttgcggtg agccaagatc acaccattgc actccagcct gggcaacaag 26760

agcaaaactc caactcaaaa aaaaaaaaaa aaaggaaaaa tagaaaatct ttgggatgta 26820 

aggcgaggta aagaattctt acacttgatg ccaaactaag atctataagg ccagtcgtgg 26880 

tggctcatgc ctgtaattcc agcactttgg tcaactagat gaaaggtata tgggaattca 26940 

ctgtattatt ctttcaactt ttctgtaggt ttgacatttt tttagtaaaa aattggggga 27000 

aagacctgac gcagtggctc acacctgtaa tcccagcact ttgggaggcc ggggcaggtg 27060 

gatcacacgg tcaggagttc gagaccagcc tggccaacat ggtgaaaccc cgtctctacc 27120 

aaaaatataa aaaattagcc gggtgtcatg gtgcatgcct gtaatcccag ctactgagga 27180 

ggctgaggca ggagaatcac ttgaacctgg gaggtggaag ttgcagtgag ccgagattgt 27240

gccactgcac tccagccttg ggtgacagag cgagactccg tctcaaaaga aaaaaaaaaa 27300 

aaagaatatc aaacgcttac tttagaaact atttaaagga gccagaattt aattgtatta 27360 

gtatttagag caatttttat gctccatggc attgttaaat agagcaacca gctaacaatt 27420 

agtggagttc aacagctgtt aaatttgcta actgtttagg aagagagccc tatcaatatc 27480 
actgtcattt gaggctgaca ataagcacac ccaaagctgt acctccttga ggagcaacat 27540

aaggggttta accctgttag ggtgttaatg gtttggatat ggtttgtttg gccccaccga 27600 

gtctcatgtt gaaatttgtt ccccagtact ggaggtgggg ccttattgga aggtgtctga 27660 

gtcatggggg tggcatatcc ctcctgaatg gtttggtgcc attcttgcag gaatgagtga 27720 

gttcttactc ttagttccca caacaactgg ttattaaaaa cagcctggca ctttccccca 27780 

tctcccgctt cctctctcac catgtgatct cactggttcc ccttcccttt atgcaatgag 27840 

cggaagcagc ctgaagccct cgccagaagc agatagtgat gccatgcttc ttgtacagcc 27900

tacaaaacca tgagcccaat aaaccttttt tctttataaa ttatccagcc tcaggtattc 27960 

ctttatagca agacaaatga accaagacag ggggaaatca acttcattaa aataatctat 28020 

gcagtcacta aacaaataag aacaagaggc tccagaagtg ggaagccaat acccagagtt 28080 

cctiacaatac agtatctgaa aagtccagtt tccaaccaaa aaatatatat atacaggccg 28140

gacatggtag cttatgtctg taatcccagc actttgggat gctgaggcgg gcagatcacc 28200 

ctacjgtcagg agttcgagac cagcctggcc aatatggcaa aaccccgtct ctactaaaaa 28260 

tacaaaadtt agccaggcat ggtggtggat gcctgtaatc ccagctactc gggaggctga 28320

grjcag-jaaat cacttgaacc caggaggcag aggttgcagt gagccgagat cacgccactg 28380 

aactccagcc tgggcaacaa agtgagactc cacctcaaaa aaaaaaaaaa tatacatata 28440

tatatgtgtg tgtgtgtgtg tgcgcgcgtg tgtgtatata cacatacaca tatatacata 28500 

tatara^aca cacatatata tatgaagcat gaaaagaaac aaggaagtat gaaccatact 28560 

ttctqrqgtt atgataggat ggggtatcac gggggaagta gacaagggaa actgcaagtg 28620 

agagcaaara gttatcagat ttaacagaaa aagactttgg agtaaccatt ataaatatgt 28680 

ccacanaatt aaagaaaagc gtgattaaaa aaggaaagga aagtatcata acaatattac 28740 

tccaaataga gaatatcaat aaaggcatag aaattataaa atataataca atggaaattc 28800 

cggaqttgaa aggtagaata actaaaattt aaaattcact agagaaggtt caacactata 28860 

tttgaactgg cagaagaaaa atttagtgag acaaatatac ttcaatagac attattcaaa 28920

tgaaaaataa aaagaaaaaa gaatgaagaa aaataaacag aatctcagca aaatgtggca 28980

caccattaat cacattaaca tatgcatact gagagtaccg gaagcagatg agaaagagga 29040

agaa^aaata ttcaaatgat ggccagtaac ttcctagatt tttgttttaa agcaataacc 29100 

tatacaatca agaaactcaa tgaattccaa gtaggataaa tacaaaaaga accacaaaca 29160 

gatacaccat ggtaaaaatg ctgtaagtca aaaacagaga aaatattgaa agcagctaga 29220 

ggaaaactta taagagaacc tcacttacaa aagaacatca cttataaaag aaccacaata 29280 

atagaaacag ttgacctctc atcagaaaca atgaatgata acatatttga agtgctcaaa 29340 

gaaaaaaaat aaagattcct atatacgaca aagctgtctt tcaaaaatat acatccaaaa 29400

ggattgaaac cagggtcttg aagagttatt tgtacatcca tgttcatagc agcattattc 29460 

acaatagcca aaaggtagaa gcaacccaag ggtccatcga caaataaata aaatgtggta 29520 

tatgtataca caatggaatt tattcagtat taaaaaggaa tgaaattctg acacatgcta 29580

caacatggzt aaaccttgag aacactatgc taagtgaaat aagccagcca caaaaggaca 29640

aataccatat tacttcactt gtatgaaata cctagggtag tcaaattcag agatagaaag 29700 

taaaacagtg gttgccaagg gctgagggag ggagtaacgt ggagttattg ttgaatgggt 29760 

acagaatttc agttttgcaa gataaaaaga gttctggaga cagatggtgg tgagggtggt 29820 

acaacaatac aaatatactt tatactactg aacagtatac ttaaaaatga ttaacatggt 29880 

gaaaccccgt ctctactaaa aatacaaaaa aattagctgg gtgtggtggc gggcacctgt 29940

aatcccagct acttgggagg ctgaggcagc agaattgctt gaaaccagaa ggcggaggtt 30000 

gcagtgagct gagattgcgc caccgcactc tagcctgggc aataagagca aaactccgtc 30060 

tcaaaaaata aaaaataaaa aaaatttaaa aatgattaag caggaggcca ggcacggtgg 30120 

ctcacaccta taatgccagc actttgggag gccgaggcag gcgatcactt gagaccagga 30180 

gtttgagacc agcctggcca acatggcaaa accctgtctc tgctaaaaat acaaaaatta 30240 

gccaggcatg gtggcatata cttataatcc cagctactgg tgagactgag acacgagaat 30300 

tgcttgaacc caggaggcag agattgcagt gagtcgagat cgcgccactg aattccagcc 30360 

tgggcgacag agcaagattc tgtctcgaaa aaacaaaaac aaaaacaaaa agcaaaacca 30420 

aaaaataatt aagcaggaaa cgagattgct gctgaggagg agaaagatgt gcaggaccaa 30480 

ggctcatgag agcacaaaac ttttcaaaaa atgtttaatg attaaaatgg taaattttat 30540

atgtatctta ccacaaaaaa aagggctggg gggcaggaaa tgaaggtgaa ataaagacat 30600 

cccagagaaa caaaagtaga gaatttgttg ccttagaaga aacaccacag gaagttcttc 30660 

aggctgaaaa caagtgaccc cagagggtaa tctgaattct cacagaaaat tgaagcatag 30720

cagtaaaggt tattctgtaa ctatgacact aacaatgcat attttttcct ttcttctctg 30780

aaatgattta aaaagcaatt gcataaaata ttatatataa agcctattgt tgaacctata 30840

acatatatag aaatatactt gtaatatatt tgcaaataac tgcacaaaag agagttggaa 30900 

caaagctgtt actaggctaa agaaattact acagatagta aagtaatata acagggaact 30960 

taaaaataaa attttaaaaa atttaaaaat aataattaca acaataatat ggttgggttt 31020 

gtaatattaa tagacataat acaaaaatac cacaaaaagg gaagaagaca atagaactac 31080 
ataggaataa cattttggta tctaactaga attaaattat aaatatgaag tatattctgg 31140

taagttaaga cacacatgtt aaaccctaga tactaaaaag taactcacat aaatacagta 31200 

aaaaaataaa taaaataatt aaaatgtttg tattagtttc ctcagggtac agtaacaaac 31260

taccacaaat tgagtggctt aacacaactt aaatgtattt tctcccagtt ctggaggcta 31320

aacacctgca atcaaggtga gtacagggcc atgctccctg tgaaggctct aggaaagaat 31380 

cctcccttgt ctcttccagc ttccagtggt tctcagtaac cctaagtgct ccttggcttg 31440

tagctatatc attcctagca accagaaaga agaaaataat aaagattatg gcaaaaaata 31500

atgaaatcaa aaggagaaaa atggaaaaaa ataaataaaa ccaaaagcta gttctttgaa 31560 

aagatcaacc aagttaacaa accttttaac tagactgaca aaaaggaggt aagactcaaa 31620

ttactagaat cagaaataaa agaggggaca ttactaatga gggattagaa aagaatacta 31680 

cgaacaaatg tgtgccaaca aattagaaaa cttagatgaa atggacaggt tcctaggaca 31740

acatcaacta ccaaaattta ctcaagaaga aagagacaat ttgaatgagc tataacaagg 31800 

gaagagactg aattgacaac caagaaacta tccacaaaga aaatcccagg cccagaagat 31860 

ctcactgtga aattctttca aacttataaa tataaattaa catcagttct tcacaaactc 31920 

ctccaaaaaa aagaacagat ctctatttac aggcgatacg atctttagaa aatcctaagg 31980 

gaactactaa gacactatga taactgataa acaagttcag caaggctgca ggatagaaaa 32040

ccaanataca aaaatctatt atatttctat acacttgcag tgaacaaccc aaaaatgaga 32100 

ttaagaaaat aattcaattt acaataacat caaaaagaat aaaaacactc aaaaataaat 32160 

ttattcaagt aagtgcaaaa cttatactct agaagctaca aaacactgtt aaaagaaatt 32220 

aaaggtttac ataaatgaaa aactatccca tgttcatgga tcaaaagact tattactggc 32280 

rtatgctctcc aaattgatct ataaattcaa caaaatcctt atcaaaatcc cagatgaggc 32340 

tgggggtggc ggttcatgcc tgtaatccca gcactttggg aggctgaggc acgcagatta 32400 

cctgaggtcg ggagctcgag atcagcctga ccaacatgga gaaaccctat ctcttctaaa 32460 

aatacaaaat tagtcaggcg tggtggcaca tgcctataat cccagctact cgggaagctg 32520 

aggcaggaga atcgcttgaa cccaggaggc agaggttgca gtgagccaag atcgtgccat 32580 

tgcaccccag cctgggcaac aagagcaaaa ttccatctca aaaaaaaaaa aaaaaaaatc 32640

ccagatgact tcactgttga aattgaaaag attattctaa aattcacatg gaattgcaag 32700 

accrttgagaa tagccaaaac aaacttgaaa aacacgaaca aaatatagga tgactcactt 32760 

gccaattgca aatgttacga cacagcaaca gtaatcaaga ctgtgtggta ctggcaaaag 32820 

acacatacat acatacatat caatggaata taattgagag tacagaaaca agcctaaaca 32880

tctatggtaa gtgcttttct atttttttct tttttttttt cttttttgta gagatagaat 32940 

ctcaccatgt tgcccaggct ggtcttcaac ttctgggctc aagcaatcct cccactgtgg 33000 

cctcccaaag tgctgggata actggcatga gccaccacat ccagcccaga tgattttcaa 33060 

aaaagtcaac aagaccattc ttttcaacaa ataggtctgg gatgatcaga tagtcacatg 33120 

aaaaaaaaaa tgaagttgga ccctccatca cactaaagtg ctgcgattat aggcatcagc 33180 

caccacatcc agcccaaatg attttcaaaa aggtcaacaa gaccattctt ttcaacaaat 33240 

aggtctggga taatcagata gtcacatgaa aaaaaaaatg aagttggacc ctccatcaca 33300

ccatatgcaa aaattaattc aaaaatgaat tgatgactta aacgtaagag ttacgactgt 33360 

aaaactctta gaaggaaaca tacgggtaaa tcttaaagac gttaggtttg acaaagaatt 33420 

cttagacatg acaccaaaag catgaccaac taaggtaaaa tagggtaaat tgtacctacc 33480 

aaaatgaaaa acctttgtgc tggaaaggac accatcaaga aatggaaagc caaaatagcc 33540 

aaggcaatat taagcaaaaa gaacaaagct ggaggcatca tactacctga cttcaaagca 33600 

acagtaacca aaacagcatg gtactagtag aaaaacagac acatagacca atggaacaga 33660 

ataaagaacc caaaaataaa tccacatatt tatagtcaac tgatttttga caatgacacc 33720 

ccttcaataa atgatactag gaaaactgga tatcgatatg cagaagaata aaactagacc 33780 

cctatctctc accatataga aaaatcaact cagactgaat taaagacttg aatgtaagac 33840 

ccaaaactat aaaactactg gtagaaaaca taaggaaaaa cgcttcagga cattggtcca 33900 

ggcaaagatc ttatggctaa aacctcaaaa acacaggcaa caaaaacaaa aatggaaaaa 33960 

tagcacttta ttaaactaaa aagctcctgc acagcaaagg aaacaacaga atgaaaagac 34020 

aacctgtaga atgggagaaa atatttgcaa actatccatc catcaaggga ctagtatcca 34080 

gaacacacaa gtgactaaaa caactcaaca gcaaaaaagc aaataatctg gtttttatat 34140 

gggcaaaaga tctgaataaa cattctcaaa ggaagacata caaatgtcac tatcattctg 34200 

ccagtaccac actgtcttga ttacttgtta gtgtataaat ttttaaattg ggaagtgtga 34260 

gtcatcctac actttgttct tgtttttcaa gtttgttttg gctattctgg gagccttgca 34320

agtataaaat agccaacaag tatgaaaaaa tgctcaccat cactaatcat cagagaaata 34380 

aaaatcaaga ccactatgag atatcctctc actccagtta gaatggctac tatcaaaaag 34440 

acaaaatata atggatgctg gcaaagattt ggagaaaggg gaactcctat acactgtggg 34500 

tagggatgca aattggtaat ggccattatg gaaaataata ctgaggtttt tcaaaaaact 34560 

gaaaatagaa ctaccatatg atccagcaac cctactactg ggtatttatc caaaggaaag 34620 

aagtcagtat actgaagaaa tatatgcact ctcatgttaa ttgcaacact gttcacaaca 34680 
gccaagacag ggaataaatc taaatgtgca tcaacagatg aatggataaa gaaaatgtgg 34740

catatacact caatagaata ctattcagcc attaaagaag aatgaaatcc tgtcatccca 34800

gcaacatgga tgaacctgga ggacattata tttaatgaaa taagtaaagc acaaaaagat 34860

aaacagtaca tgttctcact cagacatggg tgctaaaaag aaaatggggt cacagaatta 34920

gaaggggagg cttgggaaaa gttaatggat aaaaatttac agctatgtaa gaagaataag 34980

ttttagtgtt ctatagaact gtagggcgag tatagttacc aataacttat tgtacatgtt 35040

caaaaagcta gaagagattt tggatgttcc cagcacaaag gaatgataaa tgtttgtgat 35100

gatggatatc ctaattaccc tgattcaatc attacacatt gcatacatgt atcaaattat 35160 

cactctgtac ctcataaata tgtataatta ttacgtcaac aaaaaaagga aaaaaaagaa 35220

aattaagaca acccacataa tggaagaaat aaaatatctg caaattatat atatctgata 35280 

aatatttaat atttataata tataaagaac tcctacaact caagaacaac aacaaaacaa 35340

cccaattcaa aaatgggtaa aagccttgaa tatacactta tctaaagact atatacaatt 35400

ggccaataaa gacacgaaaa gatgctcaac atcactagtc atcagggaaa tataaatcaa 35460 
aaccacaatg tagaatgtag acaccacttc atatgcacta ggatggctag aataaaaagg 35520

taataacaaa tgttggtaag gatgtgaaaa aatcagaaac ctcattcgct gctgttggga 35580 

atgtaaagtg atgcagccac tttggaaaac agtctggcag ctcctcaaat tattaaatac 35640

agagttaccg tatgacccag gaatattcct cctgggtcta taaccaaaaa aatgaaaaca 35700 

tatatccaca taaaaacttg tacatgggca tttatagcaa cattattcat aacagcaaag 35760

gtggtaagaa cccatatgcc catcatctga tgaacaggta aataacatgc ggtattatcc 35820 

atacactaga atattatctg cccatacaag gagtgacatc cagctacatg ctacaaggat 35880 

gaatctcgga aaccttatgc taagtgaaag aagccagtca caaatgacca cagattatga 35940 

ttccatgcat cggaaatgac cagaataggg aaatctatag agacagaaag tagattagtg 36000 

gttgggtggg gctgggagga caggtagtac actactttcc cagaactact ggaacaaagt 36060 

accacaaact ggggagctta aacatagaaa ttgatttcct cacagttctg gagactagga 36120

ctctgagatc aaggtgtcag cagagctggt tctttctgag ggccctgagg caaggctctg 36180 

tcccaggcct ctctccttgg ctggcaggtg gccatcttct ccctgcgtct tcacatcatc 36240 

ttttctctgt gtgtgcccat gtccaaattt tgattggctc attctgggtc atggccaatt 36300

gctatgcaca aagtgaagtc tacttccaaa agaagggaag agggaacact gactaggcta 36360 

aacttatagt cattttaatg tccgcttttc ctatgagatt gtgaacacac agaagtaggg 36420

tttttatcta cattgtgcaa agtttaataa gaaaaatagr. attcaagaga agcagttcaa 36480 

tagcaggaat ttaatatggg aactaattac aaggtttagg gcaggactaa aaagccagtt 36540

gggatggtga gccaacccag agattagcaa cagtgggacc ccatctacct accacccatg 36600 

aagctggaag gataaaggag gggctattat cagagtccac aagccagtgt cagagtcctt 36660 

ggctggagct gggaccaccc tagagacact gtgcaaagca gaaaacaagg gggaaaaacc 36720

ctgacttctc ccttcctccc acctttcaat ctcccactag tgcttcctac tagccatact 36780 

tggccagaga cagtgacaag gaacactgca aaatgaagtt tgtaggaatc atctccctct 36840

gagacagaga aatatggaag ggtagaaaat gaatcagagg ataaagagaa aaaaccctga 36900 

gtactatctt atttatcttt gtatctccag tgcctaatct gtctctcaaa aaaggaaagc 36960 

aattgagaga aactgaaaac tccaattgaa atgaaagaat ggagaattac tggactagaa 37020 

gagaagagaa aaatttattc cgcatagagt aaacaagaat ggattcacaa aggacgtgat 37080 

gaatgaaaag ctataatcag caaagatttg ccagagaaat taaaaagtgg taaactcagc 37140

cacgctgtac aacctgaagg cacaatgcat gaaaacgttt caagaaatga caagatttga 37200 

agtcaaattc taagtgcttt tccagaatct ctcaagacga ttatatagct accccatttt 37260 

attaaataaa atggaaactt actaaacttt ccccttgtat taaactaaca tatgtcctaa 37320 

tagcaaacga ttctggaatt cctagagtaa aatatatttc gtcaaagtgt attgctcttt 37380 

taatattctg ctgacctcct tttgctattt aggatatttg tatacacatc acacgtaaat 37440 

ttggtctata gtttacatct acgggcttat actgttcttt ttttcatttt tttaaaattt 37500 

ccaaccccca gtatccatat actgctctct atcagggtta ttttaacttt gtaaaatcag 37560 

ctgagatgct ttccatgttt ttttttttta ttttctgcca catttgaata gcataggagt 37620 

taccaccatc aaccttggat tatttaagca ttcacgattc cacgtgtgga ttttttattc 37680 

agagtctttc ttgtcattcc tgctatcagc acagaaccca atctcagctt tccagctata 37740

ctctcacccc atggaatttg cagatgaagt tcaaaaggac ctttgcatta tcctgcctcg 37800 

ccctcttccc ccttcattta gacatcacct tcttctagaa cgtcttacct gacatgccct 37860 

gctcccaacc cctgctgccc aattgtgtgc tctcccgtgt cctggcctgc catcctcttt 37920 

agtaattgcc tgctccctca tctgtctccc cacccagaca ttaagctgaa tagactggat 37980 

ttgtgtcttg tccatcacta taatctcagc acctagtacc tagtaggtac ttaccatgta 38040 

ttcattagca aaatgttatg tataaccttg caccttaaaa acaagagaag gaagacaaaa 38100 

ttaagtctta agactatggt ttagaacatg gatcagaaac tacagtctgc agcccaaatc 38160 

cagaccaaat gaagagacca tgttcattta catacaacct atagcagctt tcacactaca 38220 

ggagcagagc taagtagttc caagggaaca cacggccctg caaagcctaa aatatttact 38280

ctatagctct tcacagaaaa agttttcaga tccctcgttt agaactcttg ttcatatgca 38340

atttcactaa accatagttt tttgggtttg tttggttttt tttggcaaaa aggaatgagc 38400

cgacccagaa aaggttgaaa agaatgaatc attactgctg aaagaatgtg cacacagtcc 38460

gtcagcattc tgctgccatg ctgacaccca tccaatagtg tcatgagatg cagcagctac 38520

tactgtgttc tcaatgccga gtccacccac tccataacca tgtccaagca atcttgggaa 38580

catcatcacc atgcttgttt atccttaagg tattgcctca catacagcag tggctggtca 38640 

taaagtcaaa tgacactagt ggccaggagg tcaagagaat gagtgaggac aggtgggtag 38700

gcagcccagg ccctagcaac agcaggagct cacccctcag tcactctagc caggactgaa 38760

atacttttca ccctttcaag agagactagg aatctggatt tttatgtgaa atatcttgat 38820 

cactaaatgt tgtcaacaga catgtcaaaa ggtaaaacta agtaagttca tggggcagat 38880

tgactartca ggttatagaa ttaaggattc ttatccaaca cagataccaa ccaaaaagct 38940

gacgtataac atattaggag aaactatgtg cactgtcgaa acatcaacaa ggggctaatg 39000

tctaaaatag tctatattgg attccagttg aaacatgggg aaaggacatg aacaggcaac 39060

ttatgtcaat ggaaactcaa aaagataaca agcatatata aaagcattct caaattcagt 39120

agtaaacaga cagatgcaaa taaaaagagg gaaactgctg ccgggcacag tggctcacac 39180

ctgtaatccc agcactttgg gaggccgagg cgggcggatc atgaagtcag gagatcgaga 39240 

ccatcctggc taacatggtg aaaccccgtc tctactgaaa acacaaaaaa ttagccaggc 39300

gtagtggtgg gcaccagtag tcccagctac tcaggaggtt gaggcaggag aatggcatga 39360

acccaggagg cggagattgc agtgagccga gaccatgcca ctgcactcca gcctgggcga 39420

ctgagtgaaa ctccatctca aaaaatataa taataattat aattataata ataataaata 39480 

gtaaataaat aaaaagagag agactgctaa agtctagaaa gttgaatgat gccaagcgca 39540

tgcaaagatc agggccttgg gatggccggg tgcagtggct cacgcctgta atcccaccac 39600

tttgggnggc caaggcgggc ggatcatgag gtcaagagat caagaccatc ctggccgaca 39660

cagtgaaacc cggtctctac taaaagtaca aaaaaatata tatatatata tatattatta 39720

tattatatat atatatatca gagccttggg aatccttgtg tgctgctggg gaaggtagtg 39780

gtgcagccac ccttgacagc aatctggcag tacttggtta tattaagtat aggcacacac 39840

cacgaccagg cagtcctact cctgggtcta aatcccaaag aattctcaca caagtccata 39900

aggagacacg tacgaggctc attcagcatt actgggagtg ggaatcaacc tgggtgtcca 39960

tctacaggag acgagatgga caaaatgtgg tggatattaa gaccagaatc accaagtaac 40020

agagatgggt ggtgagtgac aatcctaaga tacagaataa aggctagaac atgatgccat 40080

tcatgtaaat taaaaataga tgcacacaaa gcagtatacg cgtgaccctt gaatagcaca 40140

ggtttgaacr gcctgtgtcc acttacatgt ggattttctt ccacttctgc tacccccaag 40200 

acagcaagac caacccctct tcttcctcct ccccctcagc ctactcaaca tgaagatgac 40260 

aaggatgaag acttttatga taatccaatt ccaaggaact aatgaaaagt atattttctc 40320

ttccttatga ctttctttat ctctagctta cattattcta agaatatggt acataataca 40380 

catcacacgc aaaataaatg ttaattgact gtttatatta tgggtaaggc ttccactcaa 40440

cagtaggccg tcagtagtta agttttggga gtcaaaagtt atacacagat tttcaactgt 40500 

gcaggcaatc agttcccctg accccctcat tgttcacggg tcaactgtat atacacaaaa 40560 

gtattatatg aacctcatta gaatagctgt ctatagggag aagagaatga gagtgggata 40620

aaacggaatg aacaaataaa ccaacaaatg cattaacaag caaaacaaca gaggggcttg 40680

catgggccag tgatgataaa gggctaagaa tgagaatata attaattcaa ttcctcacac 40740

ctgaggtcta aaaccaagga aagggagggc caggcgtgga ggctcacgcc tgtaatccca 40800

gcactttggg aggctgaggc gggcggatca caagattagg agtttgagat cagcctggcc 40860

aacacagtga aagcccatct ctacaaaaaa tacaagaatt acccaggtgt ggtggcacat 40920

gcctgtagct agctactctg gaggctgagg caggagaatc acttgaaccc aggaggcgga 40980

ggttgcaggg agccgagatc acaccattgc actccagcct gggtgacaga gtaagactct 41040 

gtctcaaaaa aataaaaaaa ataaaaaaac agagaaaggg aggaaactag atccaggctg 41100 

actagataca gcctttagag ttagaaaaga tgatttgaca atctaagccc acactcagat 41160 

tgaatgaaat tgaaaagcct ttcaaactaa aacatttaat tacaccatct gctgcagaca 41220 

gaactcagac aactcaaaca ggtaatgtca gcgtggtgtt ttatatcacc accctcaaca 41280 

cagaataaaa atcagctgca tgtgaagcag tgactagaat gaagaaaagg ctgcttctta 41340

cttccttcta gtggttcttt ccgaaaacat taataggcac cagctctatg catgtcaccc 41400 

tgcagggaga catggggtat ataactatga cttactgttc attcctcaag gaattcccaa 41460 

tcttgtggaa gattatacac aatgaggcaa caaaaactat ccaataaaac cacggaaaag 41520 

aagccagtga caaagaagcc agtgatgaaa ggccctgtga gcagagctga tggccatttg 41580 

gggaagaaag accaacatgg atgggggtga tcagggtggc tccgtgggaa agctggaaga 41640 

gaagtggcag atctctgagc tggatgatgg gccactacca tctgtatatg gctaattaaa 41700 

gaccatgtgt ggatttttta ttcagctctt tcgtgtcatt cctgctatca gcacagaacc 41760 

caatctcaac tttccagcta tattgagcta aacttctcac ctcatggaat ttgcagataa 41820 

agttcaaaag gatccttgcc ttttcaaaat aattttgaat ggttgagtag tccctctgtg 41880

ctctctcact gacaccctct caaggctgct gagcacgtgc catgctatgg ctttctccaa 41940

catcaggaaa tgttctccac tcagtttcac cttaatacaa atgtgttctc tcttcagaga 42000 

aggcaaaaaa attcatgacc atctgactgg gagaagtcat ttctaggtaa agtgtccatc 42060 

tttttctgag gaacacagga ggaaaatctt acagaaaaga gttaacacag caggcctaag 42120 

actgcttttt aaaataaata aataaataaa taaataaata aataaataaa taaataaata 42180 

aataaatgaa tgatagggtc ttctgtattg gccaggctag tctcaaattc ctggcttcaa 42240 

gagatcctcc caccttggtc tcccacagtg ttgggattat agacatgagc cattgtgctt 42300 

ggcccaagac tgttattctt aaaaagtctc ataaaaagca tggttaatcc ttggctggca 42360

cctgggaact tagatttcag aagggttccc accatccaac ctggaaagag ggactcactg 42420

tgcctaaatt attgtgtggt ttatgctgaa ctcctgcttt tcttcaggta gcgtggaatg 42480 

tggtatgtgc tgggcaaagg gggcctgcat gaccagcccc caataaaaac cctgggtgtt 42540 

gggtctctag tgagtttccc tggtagacag catttcacat gcgttgtcac agctccttcc 42600 

tcggggagtt aagcacatac atcctgtgtg actgcactgg gagaggatgc ttggaagctt 42660 

gtgcctggct tcctttggac ttggccccat gcacctttcc ctttgctgat tgtgctttgt 42720 

atcctttcac tgtaataaat tacagccgtg agtacaccac atgctgagtc ttccaagtga 42780 

accaccagat ctgagcatgg tcctgggggc ccccaacaca gaaataaatt ataaaagacc 42840 

a.iggactggg catggtggcc catgccggta atctcagcgc tttgggaggc cgaggcagga 42900 

cj^accagtta agcccaaaag ttcaaagtta cagtgaccta tgactgcgcc aatgcactct 42960 

aacctgggag acagagcaag accctgtccc caaaacaata aactaaacac atacttctgc 43020

cttccaagtg tcttaaaatt caatggaatg gtagaaacat ttttaaaaca ctaaatcaaa 43080 

agaaacctgg aaaacaagag tgccgatggc caactaaaat gtctaggaaa tttctgaaaa 43140 

gt-aaaaagta ctcagaacca gattacctga gcaaaccata gcccaataca agcttgggag 43200 

gaggctgtta tgcagaagga aatggtaaca ggtttccagg aacagacttg taacagcaga 43260 

tagaacagca gaggtagaac ctgacaaggt gattacctgg ggaactgcag tctgaatgac 43320 

caggactgtt ggacccttcc cctcacatgg aatacacacg ccactcagca gcacaccaca 43380 

gctcttcaac aatcacagga ggcacgctac gcctagtaag acaggaaaaa aggaattctc 43440 

aaacrtcgaa gatgaacaca taaagaatca ccaagttttt attcagtatg atgaaacagg 43500 

gacactgaat caacagaaca caaacccaag caaagataat tactagagca catagaagaa 43560 

attattagat attcttggga agacctaagg ggacattata aagagcaagc agttggtatg 43620

tgacgatctt tgtgatatac caagaaataa aaacacagga tgaagaccag atagagaata 43680 

atgccactat ttgtgcaaaa aaggagaaat ggagaatctg attcatattt gcttgtattt 43740 

gcatgaagaa actttggaag gtacataagt aactaacaac aatggttacc tacttgtaag 43800

gcgagagaag taagaggaca ggaatggtgg gaacaccttt tgtgtccgga attggtgggt 43860 

tcttggtctg acttggagaa tgaagccgtg gaccctcgcg gtgagcgtaa cagttcttaa 43920

aggcggtgtg tctggagttt gttccttctg atgtttggat gtgttcggag tttcttcctt 43980 

ctggtgggtt cgtagtctcg ctgactcagg agtgaagctg cagaccttcg cggcgagtgt 44040

tacagctctt aagggggcgc atctagagtt gttcgttcct cctggtgagt tcgtggtctc 44100 

gctagcttca ggagtgaagc tgcagacctt cgaggtgtgt gttgcagctc atatagacag 44160 

tgcagaccca aagagtgagc agtaataaga acgcattcca aacatcaaaa ggacaaacct 44220

tcagcagcgc ggaatgcgac cgcagcacgt taccactctt ggctcgggca gcctgctttt 44280 

attctcttat ctggccacac ccatatcctg ctgattggtc cattttacag agagccgact 44340 

gctccatttt acagagaacc gattggtcca tttttcagag agctgattgg tccattttga 44400 

cagagtgctg attggtgcgt ttacaatccc tgagctagac acagggtgct gactggtgta 44460 

tttacaatcc cttagctaga cataaaggtt ctcaagtccc caccagactc aggagcccag 44520

ctggcttcac ccagtggatc cggcatcagt gccacaggtg gagctgcctg ccagtcccgc 44580 

gccctgcgcc cgcactcctc agccctctgg tggtcgatgg gactgggcgc cgtggagcag 44640 

ggggtggtgc tgtcagggag gctcgggccg cacaggagcc caggaggtgg gggtggctca 44700 

ggcatggcgg gccgcaggtc atgagcgctg ccccgcaggg aggcagctaa ggcccagcga 44760 

gaaatcgggc acagcagctg ctggcccagg tgctaagccc ctcactgcct ggggccgttg 44820 

gggccggctg gccggccgct cccagtgcgg ggcccgccaa gcccacgccc accgggaact 44880 

cacgctggcc cgcaagcacc gcgtacagcc ccggttcccg cccgcgcctc tccctccaca 44940 

cctccctgca aagctgaggg agctggctcc agccttggcc agcccagaaa ggggctccca 45000 

cagtgcagcg gtgggctgaa gggctcctca agcgcggcca gagtgggcac taaggctgag 45060 

gaggcaccga gagcgagcga ggactgccag cacgctgtca cctctcactt tcatttatgc 45120 

ctttttaata cagtctggtt ttgaacactg attatcttac ctattttttt tttttttttt 45180 

tgagatggag tcgctctctg tcgcccagac tggagtgcag tggtgccatc ctggctcact 45240

gcaagctccg cctcccgggt tcacaccatt ctcctgcctc aacctcctga gtagctggga 45300 

ctacaggcaa tcgccaccac gcccagctaa ttttttattt tatttttttt ttagtagaag 45360 

cggagtttca ccatgttagc cagatggtct caatctcctg acctcgtgat ccatccgcct 45420 

cggcctccca aagtgctggg attacagacg tgagccactg cgccctgcct atcttaccta 45480

agagaaaaaa aataagttaa ggatgcagaa aacctgaatt accatcaata aacttgagat 49140

taaiatagaa ctgtataccc aatatactaa gagttcaggg aacagtcgtg actgacagtg 49200 

gactgcaaat taatctgttc ttaatctttg tttttctttc agcactgtgg cagaatagag 49260 

atcccaaaaa ccttccagct acaaaacatc tttttaaaaa tataaaaaaa tacaaaaata 49320 

acuctgaaat caatagaaga cacatggtga aaccaaaatt ctagaataca gggagaataa 49380 

aggcattttc agatattaca aaaacagaaa attgatcatt gctgaagtaa tttctaaaga 49440 

atgtacttga gggagaagaa aaatgttcca aagaaaagta tctgtgatac aagaaggaat 49500 

ggaaagtgaa gaaatggtaa acaggtagat aaagctaata aatgttgacc tagaaaataa 49560 

caaaaacaat agcaataatg tctcgttgga agggttgaag taaaaataca attaaggcca 49620 

aatgtgaggt aagtggaatg aaagaattag aagtccttgc cttgttcaca ggactgatta 49680 

aataaatgag ccaggttttc cattcaaaca gttaaaactt gaacaaaata aactcaaatt 49740

aagtagaaag ataaaaaaca gaaattaatg tcatagaaaa ataaaaaatc aatagaatta 49800 

atcaataaac cctggttaat aaaagctggt tctttgaaag gattaataaa ataatcatta 49860 

agcaagnctg atcaaaaaaa aagagaaaag gtaccaaaaa aagtactgta tcagaaagag 49920 

aacatacaga tacatacaga tatgtaagag tctgttttct tacaccagaa tactatatac 49980 

aacatitatgc tagcatatat taaatttcaa taatgttaat gattttctag gaaaacagaa 50040 

aatattaaat ttactttgaa gaaacagaaa aactgagaaa aataaatgat catgaaaaaa 50100 

atgaaaaggt aattaaatac tgatattaac tgcctaaaca acaccagcag cagcccaggc 50160 

agtctgcagt caagttctgc caaacttgag ggaacagata attcttctat tccagagcat 50220 

agaaaatga: ggaaagtttc ccaatttaat cagagaggac agcctgatcc ttgttatgaa 50280

cacagataaa aatggggtaa actatatgcc aaactcagat accaaaaccc taaataagat 50340 

gctagcttat tgatgtgaac aatccaaaag tgcattttaa attagcccag ggttttagag 50400 

aaagaaaatc tagcaatgtg accaccactt atgttaacaa ttttaagacg aaaatctaca 50460 

tgatcatatc aatgcatgct acacaaaagc atttgggcaa aaaacccaac acccaccctt 50520 

gactttttaa actcttagta attaggcata aacagaaatg tacttaatgt gatagaatac 50580 

actcggtgaa gatacagagg gaatgctccc taaaaccaag cccaagacaa agattcctat 50640 

ttaaccicaa tagtcaacac tgcagcgaga gtaatctatg gaagacaagg aaaaaagtaa 50700 

aaacatgaga gacatctgtt gtttaacaga caataagatc acctacttgg aagaggcaaa 50760 

cgaatcaagc gaaaaactat taaaactgag acaggcttta gtatggaggc tcagcttcag 50820 

ctgtagtttg ggctaccaaa ttcaactcgc ttgcttggag agttaatcct gcaaagctaa 50880 

tttctgttga ggtattagga ttgacaagcc tgtgctcctc cctcctcccc catcttcaac 50940 

actgaaataa cacggtgttt ggaactggat aacagaatct tccaaaaaca aaaattgtcc 51000 

tgaagggccg acttgtgccc ttactcaaaa aacactttat ctgctgcctg cagctcctac 51060 

agttgctggt ggataagcct gccaaccagc tcggcgtaat tcttcctgca gagggcaagg 51120

aagagcactt tcacaggaaa atttttttcc gaactgtatg ccgcttatta cataaactta 51180 

cgtgctggca aatggagctc cagcaaaata agatattcag agtcaaactt ccttaggaaa 51240

aaaaaaaaaa aaaagcaagc acataacact aatttccttg catgggcact ggggaaggag 51300 

gtcgttactt ccgcacgccc gcaggtccgc accaccggga aacccacggg caccgcgcgc 51360 

tgcccccggg ccttccaggt gcactgcgcc gcggcgcccc agctgacccg ggatgcgcag 51420

ccctagccct tcccctgtca ccccggccag gaaggggcgg gagcgcggcg gacgccgagg 51480

gcgaagggct tctcggtcct ctgcaccacg cagcaccccc aaggcacaac agggagggtg 51540

cgggaggctc ccgagaccca ggagccgggg ccgggcgtgc ccgcgcacct gtcccactgc 51600 

ggcgagggct ggggtcgcct ccagggccgc agctgtcggg agccacctgg ctctcagtcc 51660 

cgggtccctg cgacaaccct cgggcccgga ggggaggagg cggccacctg ccgctgccac 51720 

ctgcggcacc ggtcccaccg ctccgggccg ggcaggacag gccaggacgt ccctcctggg 51780 

ctggggacag gacacgcgac gaggggaccg gggcccccgc ggcgaagacg cagcacgcct 51840 

tcccagaaag gcagtcccgt gcccccacga cggactgccg gacccccgcg ctcgcccgcc 51900 

catcccttca gaccacgcgg ctgaggcgca aagagccggc cggcgggcgg gctggcggcg 51960 

cggctagtac tcaccggccc cgctggctca gcgccgccgc aacccccagc ggccacggct 52020 

cc 9ggcgctc actgatgctc aggagaggga cccgcgctcc gccggcgcct ccagccatcg 52080 

ccgccagggg gcgagcgcga gccgcgcggg gctcgctggg agatgtagta cccggaccgc 52140 

cgcctgcgcc gtcctccttc agccggcggc cgggggcccc ctctctccca gctctcagtg 52200 

tctcatctcc ctatctgctc atcctctggt cgcacataat cgatgtttgg gcgtcccaag 52260 

ccagatgtgg accccatttc cgcactctac actggaggtt ttctaagggt ggtgcccgga 52320 

ccagcagctt cagcctcatc tgggaacttg agaaaatgca gattctccgt cccacccagc 52380

ctattcggtt tttcctgcac taaaaccatg aaggtggggc ccagcagtcc acattctcgc 52440 

aagcccgtca agtgattctg aggcgccctc cagtttgaga gctatgctca cggcctcacc 52500 

tccgccccgc aaggagcccg gtcttgcctg tggcgctagc cgcacacgga cacctcatcc 52560 

tgcggggccc gcccccccgc tgcaccctca ccgcccaacg cctcctccgg gatgcagcgg 52620 

aggcgcctgg aagtcggcaa ggtcaacatc cccctcagca tcttccctac cctcacggct 52680

cctcctccag gggtgcctca tggccagggg ttagaaagag ccactgtgtt tcttgacatg 52740

gaagtggcct aagaccttaa tgaaaactgc aggagtggaa tgacagaacc tttggtcata 52800

cttgagggcg tgaagctcaa atgaggagga aggaaaggat ccagggagaa taaccaaccc 52860

tggcaagttg tggcgcccag gtagaggggc gagcctaggc tagcggttct cgaccagggc 52920

cggtgttgcc cctcctcgcc gccccgcgta catttgggga ggtctggaga catttttggt 52980 

tgtcatgatg cgggagttgc tactgttgcc taagtgggta gacacgaggg tgctcctcaa 53040

catcctacct gaaggacagg actgccccac aaggaagaat gatccggccc caaataagaa 53100

accctgggct ggtcagcaac aacccctttg ttctgagaag agaggaggaa agaataaaag 53160 

aagtggggtg aagttttggt ttggtagagg aaacttgaag acattttcac tggaaaggaa 53220 

gagaggaaga ggagggagat gtctgtaagg acgagcaaac cgggtgacag ctgatttcct 53280

catattgaag taatgagtcc tagttataat aaattcctaa taaaaaccca gtttatccct 53340

gcaataaact tgtctttttt ttttaaatat actgcttgat tctgtttgct aatattttat 53400 

ttacaggctt tgcattgata tgcaaaaatg agatgggcaa taattttctt tttgaatgtc 53460 

taatgttgtt tggtttcaga atcaatgtta tgctcacatc ataaaaaatt tggaaccgag 53520

gcaggaiggag tgcttgaggc cagaagttcg agaccagtct aggaaacaca gtgagacccc 53580

cccatctcta caaaaaaaaa aaaagaaaaa aaaatgggca tgtttgcttt ttccttttac 53640

tctgaacaat ttaaggagca ttaaaattat ctattctttg aggtttgatc atttcccagt 53700

taaaaatgtt cctcccagcc tgatgctttc tttggggagg gtaaatcttt taaggctaga 53760 

aaagtttctt ctgtggcaat tttattattt acattttaaa aattattcta gagttaattt 53820

tgataaagca tgtatttctt aaaacaaatt atcctttttt tccagatgtt caagtgtatt 53880

tgcataaagt tgaggaaagt agtcttttgt gaatctttta acttctccca aatatcttat 53940

tttgtgtatt tttgcfctctt tattttgtta acttttaaaa gtgtattttt ttttcaaaga 54000 

atcagctctt aggtttatgt ttttggttat actggagctt ttttcttctt ctttttaaaa 54060

tattttttct cctttatttt ttagacgtat tttgatctaa cgtaatcgga agaaggtaaa 54120 

ttagaatctt ttgttactat tgtgttttta tttctcctta tttctctgaa gtcctgcttt 54180 

ataaatagta ccatgttatt tgtgcataaa tattcatttg tcttatattc ttgggaattt 54240

tcccacttca tcataaaatg accttccttg tctcatttaa tgtgttcaaa ctttgccctg 54300

aatttaactt tgtctgatat tttaccatcc tgctgaattt tgtttgttac cccaaacaac 54360

ctttgctgtt ttcgtctttt ctgaaccctt tattttaggt aatcccttga attagagcac 54420 

taagttttgc tttgtgatta aatctgaaaa tctttatctt gccatagatg agttgagccc 54480

tattcatgtg acagctatat tatgctgttt catagccctt ttggtccttt tttcactctt 54540 

gcattgcata ttttgtgttt attgtgtttt gtgtttcttc tgataatttg gaaggtttgt 54600 

atttttattc agggagttgc cttataatca tactccgcaa tacacatcgt cctcagtttc 54660 

ttcagactgt ctgttaactc cctattctga ataaaaatga cattgtaatt tccctctttt 54720 

ttctttaccc cttttcttct cctcacctaa tgtaaatgat tttatccttc tttagtattt 54780 

gcttttttaa ttaactacat ttataaatat ctttatcact tgatttttaa atcagctttg 54840 

aatgagatat ttggattcct agatataaaa gatgttaatt ataccatttc cacgttagta 54900 

ggtttataaa atcatacatt ctgctgtgta accataatcc cacgtttgtt ttagttccac 54960

tcctacagtt aaaagattca gaagtattat taacagttat tttgccatag ttttttcccc 55020 

aacccatttt gtggtaagtt atgatcctgc tttagtttct taagaataat ttatagagca 55080 

gagtgtggtg gctcacgttt gtaatcccag cactttggga gacaagaggt agaaggatcg 55140

cttgaagcca gcagttcaag accaccctga gcaacatagt gagaccttgt ctctacaaaa 55200 

aattttaaaa tttagccaga cgtagtggcg tgtgcctata gtcccagcta ctcaggaggc 55260 

tgaggcaaga ggattgctag agcccagaag tttgaggctg cagtgacctc tgattgtgcc 55320 

actgcacccc agtctgggca agaaagtgag aacctatctc tttaaaataa caataataac 55380 

ttatgaaaat tatattccct gagtttttca tgtttaaaaa tatttgttgc ctttatcctg 55440 

taaaagtttg agtataaatt cttgggttat actttattta ttgaagaatg tataagtatt 55500 

gtcttctaga attgagtgtt gctgtaatga aaccagaagt cagcctggtt tatttttcct 55560 

cagaaatgag gtaattgccg gccggacacc gtggctcatg cctgtaatcc caacactttg 55620 

ggaggccgag acaggtggat cacgaggtca ggagattgag accatcctgg ctaacatggt 55680 

gaaaccccgg ctctactaaa agtacaaaaa gttagctggg catggtggtg gacgcctgta 55740 

atcccagcta cccgggaggc tgaggcagga gaatggcgtg aacctgggag gaggagcttg 55800 

cagagagctg agatcgcgcc actgcactcc agcctgggcg acagagtgag actccgtctc 55860 

aaaaaaacaa aaaaaaaaca aagaagtgaa gtaattgcca tgatgctcca agaattatct 55920 

ctttgtctat gaaatccaga aatctcactg ttatacattt tggaattatt attctgggcc 55980 

aatatttcct gggacacaat agattgactc tatagattta attttttttt tttttttgag 56040 

acagagtctc actgcaatct cagcttactg caacctctgc ctcacgggtt caagcaattc 56100 

tcctgcctca gcctcccaag tagctgggac tacaggcgcg tggcaccatg cctggctaat 56160 

ttttgtcttt ttagtagaga cagggtttca ccatgttggc caggctggtc ttgaacgcct 56220 

aacctcaagt gatccacctg cctcagcctc ccaaagtgct gggattacag gcgtgagcca 56280

ccatgcccag cctcaattcc tctttctatc tggtaatttt tctgaagttg aaaacatttg 56340

ttctaatacg ttatttcagt gttcttctaa gatgtgtaaa gcaccctatt cccaggtcag 56400 

cccccatctt gctagtgagc tcggctggtt cttcacaaga gctctggttt tctcctgctt 56460 

aatctcaagt acctctgtca gcctccacct ggtttatgat ttggagtttt ttggtttttg 56520 

ttttttgttt ttgacagagt cttactctgt cacccaggct ggagagcagt ggcataatct 56580 

cagctcactg caacctctgt ctcccaggtt tgagcgattc tcctgcctca gcctactgag 56640

tagctgggat tacaggcgcg tgccaccaca cccggctaat ttttgtattt ttagtagaga 56700 

tggggtttca ccatgttggc cagggtggtc ttgaactcct gacctcaggt aatccacctg 56760 

cctcagcctc ccaaagtgct gagattacag gcgtgagcca ccgcgcctgg catggtttgg 56820 

agttttaatc tgtagtttta ataaagatag tgcttatgtt tgtgtttctt atatttcttg 56880 

gtactcttgg gtaatttgta agatccccat atctacacaa gaagtccatt ttcaattctt 56940

ttcttcagac tgtttatttt attttatttt attttatttt tatgtttgag atggagtctc 57000 

gccgtgtcac ttctggaggc tggagtgcag tggcgcgatc tcaggtcact gcaacctccg 57060 

tctcccgggt tcaagcaatt ctcctgcctc agcctcccga gtagctggga ttacaggcac 57120 

ctgccacttt ttaatttttt tagagacaga gtctcgcttt gttgaccagg ctggagtgcg 57180 

gtggtgcaat catggctgac tataacctcc aaatcctggg ctcaagtgat cctcctgcct 57240 

cagcctcctg agtagctggg actacaggca catgccacca tgcccagtta attttaattt 57300 

ttttgtagag acagggtctc catatgttgc ccaggctggc ctcctactcc tggcctcaag 57360 

taatcctcct acctcagcct cccaaattac taggattata agcatgagcc accatgccca 57420

gccttgttct actactttaa tttcatatgt taggtgacca tgtaattgat catccaaacc 57480 

aggatactgt aagaatgaaa gaggctgaca gtagtatgat gctgggacta gcattgtgca 57540

ctgagattat ttctgggaaa gcaggagata cggtcaccct acttatagtg tgcttgtctt 57600 

tggattg-tg aatttggagt ttctatttgc aggcttattt caactgggca gccttgatcc 57660

gccctgccca gcaatgctac cgttctctcc accgggtctc tgggacccct tcagtcacta 57720 

tacttagctc agttccccac cctcccactc cctaaaagcg taaccaggaa tcctgcctca 57780 

ggtctactgc cgtcttccgt gggctgtttc agttcctatt acccagagtc aaactcccag 57840 

cattccctac ctgattccag acttggagtc cagagcttta acctcttcag gccaactccc 57900 

cactttgcat ttctgtccct atatcttagt ccatggagat acatttcatg tctttgagtc 57960 

tacttacaaa gtaaattttg ctgtttttta attttttttt tgagatggag tcttgccctg 58020 

tcacccaggc tgtggtgcaa tgacgccatc tcggctcact gcaacctccg cctcctgggt 58080

tcaagcgatt catctgcctc agcctcccaa gtagctgtga ttacagacag gcaccaccac 58140

gcccagctaa ttttttttat cttttagtag agacagggtt tcaccatgtt ggccaggctg 58200

gtcttgaatt cctgacctcg tgatctgccc atctcggcct cccaaagtgc tgagattaca 58260

ggcgtgagcc actgtgccca gccaattttg ctttttttat atttcattgc tatatgttta 58320

gaggataagt ttacagtgct atatgcattc ccaaatatta gaccaaaaaa atctccaaaa 58380 

aattagaaag aaaatccaaa aaatctcaaa aaataccaaa aagcaacaat ctcacagacc 58440

atactcactg acccccaata aaataaaatt agaaattaac cacaacttaa caaaataaag 58500 

tactcaagtc agagaggaaa gaggaaataa acatcaaaat tacaaagtct aggcggtggc 58560 

tcacgcctgt aatcccagca ctttgggagg ccaaggcggg cagatcacaa ggtcaggaat 58620 

tcgagaccag cctggccaat atggtgaaac cccgtttcca ctaaaaatac aaaaattagc 58680 

caggcatagt gatgtgtgcc tgtaatccag ccacttggga ggctgaggca ggagaatcac 58740 

tgaacccagg gagacgaaga ttgcagtgag ccaaaatcgt gccactgcac ttcggcctgg 58800 

gtgacaaagc gagactccat ctcaaaaaaa aaaaaattac aaactcttta gatagaaatt 58860 

ttggtgtttt tttttgagac ggagtctcac tctgtcgcag aggctggagt gcagtgggac 58920 

tatgtcagct caccgcaacc tccatctcct ggattcaagc aattctcctg tctcagcctc 58980 

ccaagtagct aggattacag gcgcccacca ccagacccag ctagttttta tatttttagt 59040

agagatggtg tttcaccatg ttggccaggc tggtctcaaa ctcctgacct caagtgatcc 59100 

acctgcttca gcctcccaaa gtgctcagat tacaggcgtg agccaccgca ccccacctag 59160 

atagaaattt caacatgagg ccgggcacaa tggctcacgc ctgtaatctc agcacttcag 59220 

gaggctgagg cgtgggagga tcacttgggc ccaggagttc aggaccagca tgggtgacag 59280 

agacagaccc tgtctctatt tatttgaaaa aaaaaaaaaa aaagagagag agaaagaaat 59340 

ttcaacatga aaagtatctc tcaaaccctt cgagatgttg gcaaaaagcg actcaaagga 59400 

aaatgtatta ctgtgtgtga atttgcttga aaataagaaa gaggccgggt gtggtggcta 59460 

acacctgtaa tcccaacact ctgggagtcc gaatcaagtg gatcatgagg tcaggagatc 59520 

gagaccatcc tggctaacat ggtgaaaccc tgtctctact aaaaatacaa aaaattagct 59580 

aggcgcggtg gctcatgcct gtaatcccag cactttggga ggctgaggca ggtggatcac 59640 

ctgaggtcag gggtttgaga ccagcctggc ctacatggtg aaacctcgtc tcttctacaa 59700 

atacaaaaat tagctgggcg tggtggtggg tgcctgtaat cccagctact cagaggctga 59760 

ggcaggagaa tcgcttgaac ccgggaggcg gaggttgcgg tgagccgaga tcgcaccact 59820 

acactccagc ctgggcaaca gcctgggtga cacagtgaga ctccatctca aaaaatacaa 59880 






aaaattagct gggtgtggtg gcctgcgcct gtagtcccag ctacccggga ggctgaggca 59940

ggagaatgga gtgaacctgg gaggaggagc ttgcagtgag ccgagatccc accactgcac 60000

cccagcccgg gcgacagagc aagactcttg tctcaaaaaa aagaaaaaaa aaggaaaaaa 60060

gaaccctgat aataaagaaa ccaaatgttc aactctcaaa gctcggacac tttaaagaaa 60120

taattaataa aggcagaagt taaagggagg atgataaagc aatttttttt gttggttttt 60180

ttgagatgga gtcttgctct gtcacccagg ctggagtgca gtgatgcgat cttggctcac 60240

tgcaacctct gcctcccggg ttcaagcaat tctcctgcct cagcctcctg agtagctggt 60300

actacaggtg cgcgccacct ggcccagcta atttttgtat ttttattaga gacggggttt 60360

caccatattt gttaggctgg tctcaaactc ctgatctcag gtaatctgcc cacctcggcc 60420

tctcaaagtg ctgggattac aggcaggcgc caccgcgcct ggcctaaagc aaaatattgg 60480

ttctgtgcaa aaggtcaata aaaagagcaa acgtttacaa actggagcca gcacccattc 60540

aactcaatgt gtctggagaa aaaacaatct cgcttcagaa ttcatgatta cgcagccctt 60600

ttt^rtcccc aaaaatccta ctatgttgct gttgaccatt ctctctcttt ctctctctct 60660

tgrtttctct ccagaaaagc tattcagaca ttctcctctt tcctcaaacc tccaacactt 60720

cctcc:ccat ccttagcctc agctgctgac ctcacttcta atcattgaga aaccaggaga 60780

agcatttaag agtgaacctc cgcctccccg cacgggcaaa accacccacc cacagaattg 60840

tgccccautt ctgcgtcctc tcctctcacc atggatggac ggtccaggct ccgagccaaa 60900

gccagiretc ccctggagct ctggatccac cacctgcagc ttctcaggca gggccccagc 60960

agetccrctg ctcccttgta ccatcaatcc ctcccctcac tgggtcactc ccaacaatat 61020

atatat11aq tgatgtttct cccatgtggt aaaatcactt agcctctctc ctcccccagc 61080

tactatccta tttgtttctt tccattctct gcaaaacttc tcaaagcatt gtgtctatgt 61140

getoartrca tttatcttct cccgttctct gctgagtcct tcccacagac tctcacccca 61200

gttuctc:\it gaaatgacct ctgcactgcc acatccaatg gtgaatgttc agttcttaat 61260

ttcattc.ric ctttcagcag catttgacct ggccgatcac tccctcttct taaaaatact 61320

tttctC.1{1CC aggcgtgatg gctcacacct gtaatcccaa cactttggga ggccaaggcg 61380

ggag-Mtcat gagagcccag gagttcaaga tcagcctggg caacatggca agaccctatc 61440

tctacaaaaa ctaaaaagta gccagtgtga tggcatgcac ctgtagtccc atctacttag 61500

gaggct.iagg cagtaggatg acttgagcct gggaaatcaa ggctgcagtg agccatgatt 61560

gcaccactgc actccagcct gagtgacagc gagaccctgt ctcaaaaaga caaaatagga 61620

aacttctctc agcatattcc tctgattctc ctgctgcttc tgtctgcaca gattcagtct 61680

cctttgccgg ttcttcctca tcctcctgat ctcttgacct tgaagtgccc cagagtacag 61740

tctttttttt tttttttgag acgcagtctc gtctgtcacc caagctggag tgcaatggcg 61800

aggtctcagc tcatgcaacc tctgcctcct gggttcaagc gattctcctg cctcagcctc 61860

ccaagt^grc aggactacag gcacatgcca ccatgcccag caaattgttg tatttttagt 61920

agagacaggg ttttactata ttggccacgc tggtctcaaa ctcctgaact cgtgaaccac 61980

ccgcctcngc ctcccaaagt gctgagatta caggcatgag ccaccacacc cggcccagag 62040

tacagtcttc agacggcctc tctacctata cttgctcccc tcataaactc ctcctgcctc 62100

atggccttaa ataccatcgg tagactgatg actcccatat ttctcttttt tttttggaga 62160

cggagtctcg ctcagtcccc caggctggag tgcagtggcg cgatctcggc tcactgcaag 62220

ctccacctgc caagttcaca ccattctcct acctcagcct ctccagtagc tgggactaca 62280

ggcacccgcc accacgcctg gctaattttt ttgtattttt agtagagatg gggtttcacc 62340

atgttagcca ggatggtctc gatctcctga cctcgtgatc cgcccatctc ggcctcccaa 62400

agtgctggga ttataggtgt gagccaccgt gcccagccga tgactcccat atttctatct 62460

cttgctgtgt gggagttctc ctcagaactc catactcata aatccaactc tcataaatag 62520

tatctcaaat gggcaatatg ctcaaaagtc aattcctact tttctcccta aacttgcttt 62580

cctgcagtct ccaccatctt aatgtccaat ctaacattag gaggcaaaaa ctttgaagtc 62640

attcttgact cttctctatt acacacccta tccaatcttt ctgcagatcc agtcgacccc 62700

caaatccagt tagctctcat catctcccct gttaccccct ggtccaggcc atcttcctct 62760

ctcacctgaa tcactgcagc attctcctca ctggtctctt tggttctgtt ttcactccac 62820

cttagcatag tctccacaga gcagtcagag ggatcctttt aaagtgtaat tcccatcctg 62880

tccctgctct gctcaaaacc ctgtcgtgat tcccgtttta atctgtcaga ttaaaagcca 62940

gagtctttcc agtgacctac atgatctgcc tattatcacc tcccacttct ttccccttgc 63000

tcactccact ccagctctgc agctgtcctt tctgtttcct gaacagccca gattttgctt 63060

ctttagaacc tttgtatttg ctgtcccctc tgtctggaat gtttttccag gaagtcacct 63120

ggctctctcc tgcacttcct tcctgaccac catgtttaaa aatcactcaa acacacttca 63180

ggccggacat ggtggctcac gcctgtaatc ccagcacttt gggaggccaa ggtgggtgga 63240

tcacctgagg tcaggagttc gagaccagcc tggccaacat ggtgaaactt cgtctctact 63300

acaaatacaa atagtagcca ggtgtagtgg cacacacctg taatctcagc tactcaggag 63360

gctgaggcag gagaatcgct tgaacccaga aggcagagga ggtgcagtga gccaagatca 63420

cgccacaaca ccccagcctg ggtgacagag caagacccca tctcaaaaaa aaaaaaagaa 63480




aaaaaaatca cacaaacaca cttctcttca tattcctttt ccaagtttta tttttctcca 63540 

gaatacttta cattgtttta atggaagttc tccgtttccc cccaactaga atggatactt 63600 

cctgcaggta ggcactctag tcctcccatc caagtactaa ccaggctcaa ccctgcttag 63660 

cttctgagag caggggagat caggcctgtt cagggtggta tggcccagga attttgattc 63720

tgttttattc attgctgttc tgttgattct cttttgttcc tcctcctagt gctgagaaca 63780 

ctacttgtac ataataagca ttcaataaat atttgttgaa tgaatgactt gttgaatgaa 63840

ttaatctcag aaatgcagga ctggttctac attagaaaat ttttcaaggt cattctctgt 63900 

tgtcgtaaca cattaagaga ggaaaatttt gtactctaaa tcatttgata aaatacatac 63960

tgatttctgt tttcaaaaac tcttagtggc tgggcgaggt ggctcacatc tataatccca 64020 

gcattttggg aggacgaggt gggcggatca cttgaggtca ggagtttgag accagcctgg 64080 

ccatcatggt gaaaccctat ctctactgaa aatagaaaaa ttagccgggt gtggtggcgc 64140

atgcctgtag tcccagctac ctgggaggct gaggcaggag aatggcttga acccgggagg 64200 

cggaggttgc agtgagccaa gatcatgcca ttgcactcca gcctgggtaa cagagtgaga 64260 

ctccatctca aaagaaaact cttagtgagt ttaggaatcc aaggaagacc ctcaaactaa 64320 

atagataatc tagctaccag aagccttcag taaaccttaa cactccatgg tgaaacatta 64380 

gaaacattcc tactaaaaga caggctaaga atgcctgcaa tcttcacggc tagtccaaga 64440

agtcaaaaag aagaaatgag cgctgattta aaaaaataaa caaacaaaaa actaccgatg 64500 

cagaggctgg cagcaaggac tgaaggactg tacagtactt gcctggagca ggcggatggc 64560 

cacacccctg cgaagcctgc tcagctggct gggggacgct ccagtgtgtg agtggcagga 64620

tgcagggtac ttcctctgcc agggagttgc actggggaga tcctccccca ctcacacttt 64680 

ggcagctggg gctttggaat gtgacttagc ttctgtcaaa gggtcaatcc accctttgat 64740 

atatgatgca aaggcgaaca tatgatgcaa aggtgagaga acagcccaaa ttaggacttt 64800

taccacagct gtggaggtgg acagcgacag tggtgggccc tggccagact tttcatgctc 64860 

aaaggtggtg gttgttcttc ctacttcttg tccctccagg gcttcctttg cctgtgtgct 64920 

gaacctgctt cttttaattt tttttaactt ttttaaattt ttaattgttt taattaaaac 64980 

aaattttgaa aactgtctga acctgctttt gaaccctgct atgatttgaa tgtttgtccc 65040 

ctgccaaact gattttgaaa cttaatctcc aaagtggcaa tattgagatg gggctttaag 65100 

cagtgactgg atcatgagag ctctgacctc atgagtggat taatggatta atgagttgtc 65160 

atgggagtgg catcagtggc tttataagag gaagaattaa gacctgagct agcatggtcg 65220 

ccccttcacc atttgatatc ttacactgcc taggggctct gcagagagtc cccaccaaca 65280

agaaggctct caccagatac agctcctcaa ccttgtactt ctcagcctct gtaactgtaa 65340 

gaaataaatg ccttttcttt atgaattacc cagtttcaga tattctgtta taaacaatag 65400 

aaaacgaact aaggcaaact ctcatgattc tactgccatg ccattccaat aaactccctt 65460 

tatgcttaag agagccagag ttggccaggc gtggtgactc acgcctgtaa ttccagcact 65520

ttgggaggcc gaggcaggtg gatcacaagg tcaggagatc gagaccatcc tggctaacac 65580 

ggtgaaaccc cgtctctact aaaaatacaa aaaaattagc tgggcgtggt agtgggtgcc 65640 

tgtagtccca gctactcggg aggctgaagc aggaggagaa tggcgtggac ccaggaggcg 65700

gagcttgcag tgagtcgaga tcgtgccact gcactccagc ctgggtgaca gaatgagact 65760 

ccgtctcaaa aaaaaagaga gccagagttt atttctgttg cttgcaacca agaaatctgg 65820 

ctggtgcact gaagtttcca taaataatag caatttaaag actctttcca agccaggcaa 65880 

tgcctagcct tgtgtagtcc ttgtggtaat acattcattc attcatttgt tcaaccaact 65940

gtgctccaga gactaagaat acaaaaatgg gggccgggtg tggtggctca cacctataat 66000 

cctagcactt tgggaggccg aggcaggtag atcacctgag gtcaggagtt cgagaccaac 66060 

ctggccaaaa tggtgaaacc cctactctac taaaaataca aaaaattagc tgggggtggt 66120

ggcggacacc tgtaatccca gctactcgtg agactgaggc aggagaatca cttgaacccg 66180

ggaggcagag gttgcagtga gccgagatcg caccactgca ctccagcctg ggcaacaaga 66240 

gcgaaactcc acctcgaaaa aaaaaaaaaa aaaaaaagag ggccggggct gggcgcagtg 66300 

gctcacgcct gtaatcccag cactctggga ggccaaggca ggagaattac gaggtcagca 66360 

gatcgagacc agcctgacca acatggtgaa accccatctc tactaaaaat acaaaaatta 66420

tccgggcgtg gtggcgcaca cctctagtcc cagctacttg ggaggctgag gcaggagaat 66480 

cgcttgaacc cgggaggcag aggttgcagt gagccgaaat catgccactg cactccagcc 66540

tgggtgacag agtgagactc cgtctcaaaa aaaaaataaa aaaaaaaaaa gaattcaaaa 66600 

attgtagagt tatagtgtgc ttctagttta gttgagagga catctgtcct tcaaggaagg 66660 

ctagaatcta taccctgagt ccttactgaa atcaatccag cagtcaaaac atgggaccaa 66720 

cgatcacagc agtaagatag gaagagcacc tttgtacatt tagctcatgt tgagataagc 66780 

cactgacaga gctgaaggaa gctcacagtt ctgggttcca tcctttggca tttaaaaaga 66840 

aaagtgctaa gaaaattcgg ttggtcacgg tggctcacgc ctgtaatccc aacactttga 66900 

gaggccaagg caggcagatc acgaggtcag gagttcgaaa ccagcctggc caacatggtg 66960 

aaaccccgtc tctactaaaa acagaaaaat tagccgggca tggtggcgca tgcctataat 67020 

cccagctact caggaggctg aggcaggaga attgcttgaa cccgggaggg ggaggttgca 67080

gcgagtgaga gcaggccact gcactccagc ctgggagaca gagcaagact ctgtctcaaa 67140

aaaaaaaaag aaaaaaagaa agaaaggaaa aaaagaaaga aaaaaaaaga aaaaagaaaa 67200 

ttcaggccag gccaggcctg gtggctcaca cctgtaatcc caacactttg ggaggctgaa 67260 

gcgagacggt gccttagccc aggagtttga gaccagcctg agcaacatag cgagaccctg 67320 

tctctataaa aaaaaatttt tttttggcca gacgcagtgg ctcacgcctg taatcccagc 67380 

actttgggag gccgaggcag gtggatcacg aggtcaggag atggagacca tcctggctaa 67440

cacggtgaaa ccccatctct actaaaaaat acaaaaaatt aaccgggcgt ggtggcgggc 67500

gcctgtagtc ccagctactc gggaggctga ggcaggagaa tggcgtgaac ccgggaggcg 67560

gagcttgcag tgagccgaga ttgcgccact gcactccaga ctgggagaga gtgagactcc 67620

gtctcaaaaa aaaaaaaaaa aaaaaaaaat taattgtcag gtgtgctggc atgcagctgt 67680

agtcctagct actcgggagg ctgaggtaag aagatcgctt gagcccagga gttcaaggct 67740

gcagtaatag tgcctctcac tctaccctgg gtgacaatga gaccctctct caaaaagaaa 67800 

gaaaaaaggg aaagaagaaa agaaagaaag aaagagaaga aaggaaggaa gaaagaaaga 67860 

aaaagaaaag gaaggaagga agaagaaaaa aaaagaaaga aagaaaagag agagaagttc 67920 

aaagaccaaa gggtcaggat cccaaaatag tttttatgtt ttatttattt atttacttat 67980 

ttatttttga gacagtatgg ctctgtcgcc caggctggag tgcagtgatg cgattgcggc 68040 

tcactgcagc ctccaaactg ggctcaggtg gccctcccac ctcagcctcc cgagtagctg 68100 

ggaccacagg cgcgtgccac catgcccagc taatttttta attctttgta gagatgaggt 68160 

ctctatatgc tgcccaggct ggtctcgagc tcctgggctt aagccatcca cccgcctggg 68220 

cctcccaaag tgctgggatt acagaagtga gccaccgcgc ctaatcgggt ggtttgtttg 68280 

tttattgacg gggtctcgct gctgcccagg ctggagtgcc agtggctgtt cacaggtgca 68340

gtcctggagc attgcatcag ctcttgggct ctagcgatcc tccagagtag ctgcagctgg 68400

gattccaggc gcgccaccgc gcggggctca gaatgggttt ttatattgag ggttatgctg 68460 

ccacctagag gatatatgta gtaccgaact gtgtgcgcag ggaggctgag gttgcagtga 68520

gccaagatga tgccagggca ctccagcgtg ggtgacagag caagatttca tctcaaaaaa 68580 

aaaaaaaaaa aaaaaaaaaa aagaattgaa agtaaggtct tgaagagata tttgtgcctg 68640 

tatggtcata gcagtattaa ctttgaccca ctagctaaaa cacaaaagca acatgtgtct 68700

gtcagcaggt gaacggataa acaaaatgtg gtatatatgt acaattgaat attattcagc 68760 

ctttaaaaag gaataaaagg ctggatgcgg gggctcacgc ctgtaatcct aacactttgg 68820 

gagactgagg tgggtggatc acccgaggtt aggagtttga gaacagcctg gccaacatgg 68880 

tgaaacttca tctctactaa aaatactaaa attagccggg catggtggca cttgtctgta 68940 

atccaagcta ctggggaggc taaggcagga gaattgcttg aactcaggag ccggaggttg 69000 

cagtgagcta agatggcacc actgcactcc agcctgggca acagagtgag actccatctc 69060 

aaaacaaaca aacaaaaaat tattatttcc aaagaaacaa gaccctgggt ccatttccca 69120

gcccacacct gatgttgact cacaacacac agcctggttt gctatgagcc tgcttcattt 69180

aattgtcacc ttaacttcac atcaccctca agtcctggaa taactctttg ctgacctttg 69240 

tgtgctgagc catctccatg tcgctcaacg tgcagtccct ctcactgcac tgagtcaata 69300 

gccagacgtg gtctgactgc agggtcatcc ttggtggctt aggctgactc gggcatagca 69360 

Sggtgctctg agacctcacc gcatataggc tttgccccca ataaactcta tataatattc 69420 

atattatgtg gtctgggtgt gtgtagcttt gcactgtctt ctcgtgacag tgccctcaac 69480 

ctctttccca ggatttcctc ctctacctcc tcaagtccca ctgctctgca aagaccaaaa 69540 

gctgcagagt cccagctccc tcctttacac cccacgacgc agcctcctct ctcagaaccc 69600 

tttaaacaga gtcttttact gcagatccca agaacagcca cacccctctc tcccacccac 69660 

tccagacaca cccaggtaat tatagcaccc agggtaacta tgtagatgga gtccctggaa 69720 

catgtggata gtgccccctg ggagtatgca aaagcaacat tgctggcacc tgcagagaac 69780 

agggtgacat ccaggaatca gagcatgggc ctctgggagg tagggatgtg gccaggcagg 69840 

ctgccaaaaa ttggtagagc aaggccacag gatctttctg accttccttc caaacagagg 69900 

ctcctgtact ggtgatccct gtgttgattg accactccct tcctgggggt cgtggtctct 69960 

gtcccagttg cccggacttc tgtgagtgtc ctactgaggt ccttttcatg agaagcatgc 70020 

tgtccttcca cctgctggga gcaagagtga caacttcaat actataatag cagtggcata 70080 

cagagaagaa gaaagatgaa gtggcaagaa aaacaggctt ccaagcagga gtttttctat 70140 

aaaaacaaaa acgtttacaa gcaaactttt tataaagggc tagatagtaa atattttagg 70200 

ctttgagagc cacatagact tgtttgcagg gactcaatgt cgctattgta gtttgaaagc 70260 

agccatcagg gttatgtaaa tgagtgagtc tgattttgtt tcagcaaaat tttatttacc 70320 

aaaacagaca atgagtgggc tggatttggc ccatgatcct tagtttgcca actcctgctt 70380

tgggctcacc cagatctgat tttgaattct ggctctgcta ctggttagct gcaggagctt 70440 

ggaaggctct ctgagcctgt ttcctcatct gtaaaattaa agcaataatt tctaacactc 70500 

aagagtgtta cctcacgcct gtaatcccag cactttggag gctgaggcag gcggatcacc 70560 

tgaggtcaga agttcaagac cagcgtggcc aacgtggcaa aaccctgtct ctactaaaaa 70620 

atacaaaaag tagccgggca tggtggcgcg catctgtaat cccagctact tgggaggctg 70680

aggcagggat
cacaccccag 
ttagaaggtt 
gatagctgca 
ttgtcaggga
grjcattggat 
gagtctcttg 
c:ctgcctcc 
tgccaccaca 
catgccggtc 
ggaataacag 
crttttacta 
c««t t cacaaa 
gcaacatgac 
t cjacaagga 
at Lit cat tga 
gaag^ia^aea 
aa^iaca .it 1 1 
acccagattg
ctgttg^agg 
atattattgc
tctqat cent; 
ttttttgttt

a:ctt ; n:tc 
gaagt cirictg 
gaggt 3gt 
cccgcrt egg 
ctagctttct
ataaaagacc 
actcct qutg 
accaagaaag 
cagcacgtta
tttgggaggc
agacttagaa
ctcgcttccc
atgagaccct
ctaaaagagg 
atcaggccca 
tagtcaaagt 
ttgagtggtt 
agaaggctgg 
taaagatgga 
cctgagcaag 
ctagccctgg 
agctgttgga 
gagggccttt
taacagatgg 
ttctcaacat 
gggtcctgat 
cgagtccacc 
ccatagagca 
tgctaaacaa 
actgatggat 
ttgtaaactg 
tataatgagc 
tttggccggc 
ctcctgccga 
cattttacca 
attactatgc 
ctgtgacagg 



actgctagaa
cctggccgac 
ttgagataat 
ttggtctaat 
ggacttccta 
agaggagttg 
ctctgtcacc 
caggttcaag 
cccggctaat 
tcgaactcct 
atgtgagcca 
gctgcaagat
atgaggcaaa 
atttaatgaa 
tatcatttta 
aacgcattaa
agaggactga 
caagttaaat
gattaagttt
agtgatggtc 
aaagtgaggg 
ttgtttttta 
actgcaacct
tgatcactga
gttattctgg
aagatatggc 
caaggtgggc 
gcaaccagct 
tcatttctca 
catctcaggc 
attgggattt
acattgacct
aaagtcagag
ctgggggtgg
actattgagg
ccagacccca 
cagtaaagtg 
gcgacatggg 
agggtggatt
tcatggccct
agtgaggacg 
ttctttatca 
cctcctatct 
tggagtcgct
aagacctgag 
ctgggtggtg 



cctgggaggt
agagcgagac
gaataaaaga 
tataacagtt 
tcaggaggta 
agagaacacc 
caggctggag 
cgattctcct 
ttttatattt 
gacttcaggt 
ccgcacccag 
ctcaggcaat 
taataatatc 
atgagaagtc 
ggtggatact 
aattcatttt 
ttataatgct
tttaggctct 
agcccatctg
attattgttg
aattgcatga 
gacagagtct
tggtcaggct
tgctgggatt 
ttctagggtt 
tcccaatatg 
gactcaactg 
cattccattt
aaggcttaag
gagtgttctt
tagagtaact
tcttttagca
ctactgtaag
ttccaagtac 
ccaggcatgc 
ggcgcggtgg 
aggtcaggag 
caaaaacatt 
ggcaggagaa 
gcactccagc 
tagctgggca 
atcgcttgaa 
cctggtgaca 
aaagagcaca 
gaaatagaaa 
tacngaatag 
aagaatcctt 
ccaaggtgga 
acctcgtctc 
tcccaggtac
caaaaaaaaa
9 J 99~tgagg
dtctcagcta
aaaaaaaaaa
ccatgttgac 
tcaagacaca 
acactggtga 
aaagaggggt 
atcattcctg 
caatgtgatc 
tagagactag 
agactttgga 
acccattctg 
ctttgcccta 
ctctgtcacc 
ccggttcaag 
accacacccg 
tggtctcgat 
catgtgtgag 
tagtaaatta 
attatcaaac 
gggtgctcat 
ttattaccgc 
cttgcatcaa 
tattcttttt 
tttgttattt 
gatacaggca 
gcaatttatc 
acatttaatt 
tttttgagac 
tcacttcaat 
tgggattaca 
ggttttcgcc 
tcagcctccc 
attctttagt 
gttgtcagct 
cggattccag 
ccaccacacg 
ggctggtctc 



atatgacaag 
caccccccgc 
tacggcccca 
ctcatgcctg 
atcgagacca 
agccaggcgt 
tggcgtgaac 
ctgggtgaca 
tggtggcgcg 
cccagtaggc 
gagtgagact 
gacagagtca 
gtatatttta 
caacgtgtgg 
gatccgtccg 
aggatcactt 
tactaataat 
ttgggaggct 
atgccatgcc 
aaaaaaattg 
ggggtggatc 
tctctactaa 
ctcaggaggc 
agatcgcgcc 
aaaaaaatac 
gacatcatgc 
tacatgccag 
catttttaaa 
ctggcctttg 
atagaagtat 
tgggttgggg 
aatcaaccac 
cacaagggct 
actccatagg 
tgcatttttc 
caggctggag 
cgattttcct 
gctaattttt 
ctcctgacct 
ccaccgcgcc 
taaccgtgaa 
ctaaggatag 
gtgtgtacca 
ttacactcaa 
agactttcta 
tttttttaaa 
ttgtgggtac 
tgcaatgtga 
cttcaagtta 
ttgtattgac 
agagtctcac 
ctctgcctcc 
ggcacacacc 
atgttggcca 
aaagtgctag 
gaggtctgct 
tgggctggag 
caattctcct 
cggctaattt 
gaactcctga 



gtttctcttc 
ctttcttctc 
gctcacattc 
taatcccagc 
tcctggctaa 
ggttgcaggt 
ctgggaggtg 
gagcgagact 
tgcctgtaat 
ggaagttgca 
ctgtctcaaa 
caggtatttg 
cacttacagc 
ctatttagta 
gcatggtggc 
aaggtcagga 
acaaaaaaaa 
gaggcaggag 
actactgcac 
ttgggcgtgg 
acctgggttc 
aaatacaaaa 
tgaggcagga
tctgcactcc 
ttgattgtct 
tgattgtaag 
aaggtgagat 
cctgctagat 
tccccagcta 
ttttgttttg 
gctttggatc 
atgggcagac 
tgggtaagct 
gagaggacaa 
ccttgtctga 
tgcagtggaa 
gtctcggcct 
ttgtattttt 
catgttccgc 
cagccccctt 
tataacagct 
ccttggggac 
tgccctctaa 
tgtttattca 
tctcatgtac 
ctttgcacat 
gtagtagata 
aataagcaca 
caaacaatcc 
tagagtcact 
tcagtggccc 
ctggttcaag 
accatgccca 
ggctggtctt 
gattacaggc 
ggtgacaatt 
tgcaatagca 
gcctcagcct 
ttgtattttt 
cctcaggtga 



aaacaggctg 
cttttccttc 
ctttccttat 
actttgggag 
cacggtgaaa 
gcctgcagtc 
gaggttgcaa 
ccgtctcaaa 
accagctact 
gtgagccgag 
aaaaaaaaaa 
cagtaggaag 
acatcttcgt 
ttcactaaaa 
tcacgccttt 
gttcgagacc 
ttagccgggc 
aatagcttga 
tccagcctgg
attagctggg
gaatttcttg
atcctgggtg
ctggacataa
actaggggcc 
aggtggcatc
ttcctggttg
aaaaaaaaaa
agaaaaaaaa
tcttggacag
atggtggtgc
gcaacagagt
ctgtaatccc 
gaccagcctg 
cgtggtggtg 
aacccaggag
caagtgttcc
ctaagattca
tctctttaaa
ttgggccagc 
agtgtgacct 
acatgatgga 
gcaatgctct 
tcatttggta 
tattatgaga 
tcactgcaac 

tgggactaca 

gggtttcacg 
ctctcaaagt 
taaagtgtat 
tttgtgagca 
gcagttggtg 
aattaacttt 
cataccactt 
ttgaagtaaa 
attttttatt 
tggagtacat 
tggggtatcc 
tttaagttat 
atcaaatata 
tgcagtggca 
gcctcagcct 
atattttttt 
cctcaaatga 
acacctggcc 
ttttgagact 
tcactgcaac 
tgagagatta 

ggggttcacc 

ttggcctccc 



tattctctaa 
tactacatgc 
actggggctg 
gcggatcatg 
actaaaaatg 
aggaggctga
ttgtgccact 
aaaaaaaaaa 
aggcaagaga 
tgcactccag 
agacagaaag 
agagtgcacg 
acatttaaaa 
tgcaagtcta 
ctttgggagg 
acatggtgaa 
atgcctgtaa 
cgctgcagtg 
gagactgtct 
agcactttgg 
gccaacatgg 
ggcacctgaa 
gcagaggttg 
actatgtctc
gtggcctgcc

gattacaagc atgagccacc acgcacagcc aattttttcc gtttttgtct gaaatcttat 77940

tttgtgtcat ctttgaaata tatttttgat ggatataaaa ttgttggttg atagttatta 78000 

tcattattat tattattttg agacagggtc tcactctgtt gcctatgctg gggtgtagta 78060

atgtgatctc ggttcactgc agacttgacc tcctagggct caggtgatct tcccacctca 78120 

gcctccctag tagctgggac tacagatgca tgccaccata cccaactaat ttttctattt 78180 

tttgtagaga tgaggctttg ccacatttcc caggctggtc tctaactcct gagctctagc 78240 

aatccaccca ccttggcctt acaaagtgct gggccatgac tagccagcag ttacttttta 78300

tagcatattg aatatttaat atgaatcttc tggcatccac tgtaactgtt taaaaaatca 78360 

gctgtttact tggcactctt tttttttttt ttttttttga gacagagtct tgccctgtcg 78420

cccaggctgg agtgcagtgg cgtgatcttg gctcactgca agctctgcct cccgggttca 78480

cgccattctc ctgcctcagc ctccggagta gctgggacta aaggcgcccg ccaccacgcc 78540

cggctgattt ttttgtattt ttcgtagagt tggggtttca ccgtgttagc caggatggtc 78600

tcgatctcct gacctcgtga tctgtccgcc tcggcctccc aaagtgctgg gattataggc 78660 

grgagccacc gcgcccagcc tctttttttt ttttttttag acggagtctt actctgtcat 78720 

ctaggctggt gtacagtggc gtgatctcag ctcagtgcaa cctccacctc ctgcctcagc 78780 

ctgccaaata gctgggatta caggtgcgta ccatcacgcc cggctaattt ttgtattttc 78840

agtagagatg gggtttcacc atgttagaca ggctggtctc gaactcctgg cctcaagtga 78900 

tctgcctgcc ccagcctccc aaagattaca ggcatgagcc accgcacccg gccaagtagc 78960 

actcctttga aggtaatctg cttcccctac ccctagcaat ttttaacaat ttttcttcat 79020 

ttttatttcc tgaagttttg ttattaataa tctgtgtgca gatttctttg tatttctttt 79080 

gtttgcagtt catagtgatt cttgaattag tgtgttggtt tctgttatca ccacaggaaa 79140

attgtcagcc gttagctttt caaatatttc cttgctaaat tctctcttct cccctttcgg 79200 

tacaattgat ttgattaaaa ctaaaaccag ggccgggtgc agtgactcat gcctgtaatc 79260 

ccaacacttt gagaggctga ggcaggtgga tcacctaagc tcaggagttc aagaccagcc 79320 

tggccaatat ggtgaaaccc cgtctctact aaaaatacaa aaattaccag gcatggtggc 79380

acacatttgt agtcaggagg ctgaggcagg agaattgctt gaatccagga ggtggaggtt 79440

gcagtgagct gagatcccac cactgcagtc tggcctgggc gacagagtga gatgagaatc 79500 

tgtctcgaaa aaaaaagtta tgaatgtttg ataaactata tttgttagaa tgtttgttgt 79560 

agaatactat tcattgattt ttaaacaatg ttagattaaa ccattcactg gatttgtgat 79620 

aattaactta ctgattttac ctcactgatt tgttgtaatt aatacaactg gtataaaaag 79680 

actgtgacga ggccgggcat ggtggctccc gcctataatc ccagcacttt gggaggctga 79740 

ggcaggcgga tcacctgagg tcaggagttc aagaccagcc tgaccaacat ggtgaaaccc 79800 

catctttact aaaaatacaa aattagccgg tcgtggtggt gcatgcctgt aatcccagct 79860 

cttcgggagg ctgtggcagg agaatcactt gaacccggga ggtggaggtt gcagtgagcc 79920 

gatatcgcgc cattggactc cagcctgggc aacaagagcg aaactccgtc taaaaaaaaa 79980 

aaagaaaaaa aacacataaa acaaaacaac actgtgacgg ttcccaaaaa ttaggagcat 80040

aattaaagga actcctgata aaaattaatt ttatcttaca tgtaaactaa aatgacttta 80100 

tgaagttaat tcagaaatac aatgcagggt attagtttgc cacagctgcg tattcagcct 80160 

aatgtaatat tcttgttatt tttaaattct tcttttaact ttactcatat gtggatcatc 80220 

aaatttcaaa agattaaatg acaatactct tagcagcaag cttccctaag catataaaca 80280 

ttttaatggg tgatgattca gaaggtaccc gaagaatatg tactgccaga tatcattcac 80340 

ccccatatac ctgcccgaca gacatcccat tttgggaccc tggataaatg tgtgggtgga 80400 

gagaaagata ggagaaagtg gtataagcaa atggctttgg agtctgattg acagcgattg 80460 

aaatcctgtc tctacctctt aacagcctca tgatcctaca taagttaccc cgatcctcag 80520 

ggccacatct gtaaattggg ggttgcgatg gcagccatct cacagggtct cttttcgggg 80580 

aagggcagga attatggatt aagtgagcta gtaattgtaa agcacttaat acaaggaggg 80640 

cgcataataa gtacttcata aataatgacg gccattatca tgactgaggt gtatgcagct 80700

gtcggggatt acggcgactt cagaatttct ggtgggcagg gctcaaaggc agcaaatcac 80760 

actggaagtc gaggtgaggc actgcttctg cacagactgc ttagctggag agaatgagga 80820 

aggcttagag gagatttaga ggaacttaga gtcctccgcc tccaactctg tgggatctgc 80880 

tcccgtgcca gagacattca ggggatttct cgcactctcc cctcccctac gtccctcccg 80940 

ccccatccaa ctaaccacac aacacataca aaatagcccc tgcgaggttc tgcacgctgg 81000 

aagggaacag gagaagggcg ctgcgctttc ttgctgatgc cctgtacttg ggcccctggt 81060 

agacacagcc acttgtcccc tcagcctgca gagaaatccc acgtagaccg cgcccgggtc 81120 

cttggcttca gccaatctcc ctttggtggg ggtgggatgc acgatccaag gttttattgg 81180 

ctacagacag cggggtgtgg tccgccaaga acacagattg gctcccgagg gcatctcgga 81240

tccctggtgg ggcgccgctc agcctcccgg tgcaggcccg gccgaggcca ggaggaagcg 81300 

gccagaccgc gtccattcgg cgccagctca ctccggacgt ccggagcctc tgccagcgct 81360 

gcttccgtcc agtgcgcctg gacgcgctgt ccttaactgg agaaaggctt caccttgaaa 81420 

tccaggcttc atccctagtt agcgtgtgac cttgagcagt tgactttatt tttcagtgcc 81480

tagttttcca gataccagga ctgactccaa ggactattac tcatctggag ggtttagcac 81540

agtaccgtcg catagtaaat ttccatgtca gttttggtta cctttcatgc acttgcaaac 81600 

atgccatgct ctgaaacgaa ataggcacat cttttttttt ttttttttta aggagtcttc 81660 

ctctcgccca ggctggagtg cagtggcgcg atcttggctc actgcaacct ccacctcccg 81720 

tgttcgagat tctcctgcct cagcctcctg attagctggg actacaggca tgccacgacg 81780 

cccagttaat ttttgtattt ttagtagaga cggggtttcg ccatcttggc caggctggtc 81840

taactcctga cctcaggtga tctgactgcc tcagcctctc aaagtgttgg gattacaggc 81900 

ataagccact gcatctggcc agaaatgaaa taagtaaatc ttttaacctg ctctaacaat 81960 

atagtgaaaa gaccatatta ttattagagc aggttaaggg atttgcctat ttcgggttct 82020 

agttatagtc ttaaacttgg acattcttgt agaaagtaaa aagtttcctc ttcaaagttc 82080 

cccttcttgt taaagaatac atcataagtg ttagaagtaa tagtttattt taaagactaa 82140 

ctttcttcaa gcctccttgc tttgtgctaa taactctttg ttaagcccta tcctatgtaa 82200 

ctgttggaca tgctcacagg cacgttccag ttcacagcct atgccccttc cttatttgga 82260 

aatgttattg cttccttaaa cctttcggta agcaacttcc tctccttctt cgttcttcct 82320 

tgcacrtacc tatttagaaa gttttaggct attagcaaat cggctatcag tttaagagtg 82380

tgaggtcccg ctccagccaa tggatgcagg acatagcagt gaggacgacc caaatgcgta 82440 

agggataaat atgtttgctt ttcctttgtt caggtgtgct ctcgacatcg ttccatctgc 82500 

gattgagcac cctttctgca gaaagtaaag attgccttgc tggagatctt ttgtctccgt 82560 

gctgactttt crtcgtggca ccgattatct atttctaaca attttggtat ttctaacatt 82620 

ctgaacaatc ttgggctagt tgtctcttct gggcctgttt ccccatccgt cacatgataa 82680 

acttcatcgg tttaaaaacc ccagcgaaca tttattgagt tactattacc ttcctgccct 82740

ccccaacccc aaccccaggg agcagttaca acctcagccg ctgagcgcac tcgccgggtg 82800

ttaagaagca ccaaagacag ggaggcttga ttgattttgc tttgggagta gagggtcaga 82860 

agattcacag gaaaatggca tttgagcaag gatgattcac tggagctagc ttttaaatac 82920 

tggcgaggct tttatgttgc agtcccttac aaagttgagc attcgcaggg actgcactcc 82980

gaaataagcc cgcttcccct tttcattcgc taatgatcca gggagctgct ggttccgcat 83040

gcggcaggtt gtgccttttc ctaatcaggg ttctgcatcg cctcgaaccc gcaggccgtg 83100

gcgggtcctc ctgaggaagc agggactggg gtgcagggtg aagctgctcg tgccggccag 83160

cgcctgtgag caaaactcaa acggaggagc aggaggggtc gagctggagc gtggcagggt 83220

tgaccctgcc ttttagaagg gcacaatttg aagggtaccc aggggccgga agccggggac 83280

ctaaggcccg ccccgttcca gctgctggga gggctcccgc cccagggagt tagttttgca 83340

gagactgggt ctgcagcgct ccaccggggg ccggcgacag acgccacaaa acagctgcag 83400

gaacggtggc tcgctccagg cacccagggc ccgggaaaga ggcgcgggta gcacgcgcgg 83460

gtcacgtggg cgatgcgggc gtgcgcccct gcacccgcgg gagggggatg gggaaaaggg 83520

gcggggccgg cgcttgacct cccgtgaagc ctagcgcggg gaaggaccgg aactccgggc 83580

gggcggcttg ttgataatat ggcggctgga gctgcctggg catcccgagg aggcggtggg 83640 

gcccactccc ggaagaaggg tcccttttcg cgctagtgca gcggcccctc tggacccgga 83700 

agtccgggcc ggttgctgaa tgaggggagc cgggccctcc ccgcgccagt ccccccgcac 83760

cctccgtccc gacccgggcc ccgccatgtc cttcttccgg cggaaaggta gctgaggggg 83820

cgccggcggg gagtcaggcc gggcctcagg ggcggcggtg gggcaggtgg gcctgcgagg 83880

gctttcccca aggcggcagc aaggccttca gcgagcctcg acctcggcgc agatgccccc 83940

tgagtgcctt gctctgctcc gggactcttc tgggagggag aaggtggcct tcttgcgcga 84000

ggtcagagga gtattgtcgc gctggttcag aagcgattgc taaagcccat agaagttcct 84060

gcctgtttgg ttaagaacag ttcttaggtg ggggttagtt tttttgtgtt tctttgagga 84120 

ccgtggatca agatcaagga aatctcttta gaaccttatt atggaagtct gaagtttcca 84180 

aatgttgagg gttttatgtc taaaagcaac acgtgaaaaa attgttttct tcacccagtg 84240 

ctgtcttcca atttcctctt tggggggagg ggtagttact gctgttacta aaataaaatt 84300

acttattgct aaagttcccc aacaggaaga ccactacttt tgatgacttt ggcaagtttg 84360

ctaactactg gaaccctaac ttacaaacga actacttaca tttttgattt ccagttgtat 84420

tacctgccca atgtttacgt agaaacagct taattttgat tctgggtaac gttgttgcac 84480

ttcattaaaa atacatatcc gaagtgagca agtatgggtc tgtggacagc agtgattttt 84540 

cctgtcaatt cctgttgctt cagataaaat gtaccagaca gaggccgggc gcggtggctc 84600

acgcctgtaa tcccagcact ttgggaggct tggcgggtgg atcacctgag atcgggagtt 84660

caagaccagc ctgaccaaca tggagaaacc ccgtgtctac taaaaataca aaattagcca 84720

gggtggtggc gcatgcctgt aatgccagct acttgggagg ctgaagcagg agaatcgctt 84780

gaacctggga ggcggaggtt gcggtgagcc gagatagcac cattgcactc cagcctgggc 84840

aaaaagagcg aaactccgtc tcaaaaaaaa agtaccagac agaaatgggt tttgttttct 84900

ttttttgttt tgagacggag tttcgctctt gttgcccagg ctcgagtgca atggcgcgat 84960

ctcagtctcg gctcactgca acctctgtct cccaggttta atcgattctc ctgcctcagc 85020 

ctcccaagta gctgggatta cccatgcccc accatgcccg gctaattttt gtatttttag 85080

tagaaacggg gcttcaccat gttaggctgg tcttgaaccc ctgacctcaa gtgggcctcc 85140

cacctcggcc tcccaaagtg ccaggattac aggcatgagc caccgcggcc agccagaaat 85200 

gggttttgga aaaagcacta aacaaaatcg aacttggttt catatgacag ctctgctgct 85260 

aactgtaaca ggggcagacc agttaaccta cttttctgtc ttctgtcagc tgagaattag 85320 

atgattccca aaggcccatt gaactctgaa tgactttaaa tacttcttct taagtgggta 85380 

cacggttttg gtaactgatg ccaggtgatg aatgcatgaa agtgcttaat gaatgaaacc 85440 

ggtaaaatag taggaggaag ctttattggt aaggcagggg tatacctaat agctctctaa 85500 

tttattggta ttgaagtggt taacttttgt ttttttaagg ggggaaaaca ttctaagaat 85560 

aatgaggcaa actgcatatt gcacaagaga ctgttgtctc tattcaacaa ataccttttg 85620 

agtgtccaga gtctgccagg tgctgtgcta ggccctcacg attgagtagt gaaccagaga 85680

argtccctgc acccatggag cttattgtct actggggtag acagataata aataagcaaa 85740 

caaatcttct ctcttctccc tttcgctcca tgtaagtgtg tgtgtatagg tgtatactta 85800 

caagttgagt aaagtgttat gaaagattaa gaggagaaat gcattttggt tagatgttag 85860 

aggactcagc aggtgacctt gaaacttaga gctgaaggat cagtaggagg taactagaga 85920 

ggccagggaa tcgcatgttc aaaggccagg aggcaagaaa gagcatggtg cccttcaaga 85980 

gaggaaagaa ggctactgtg actggagcat agatgtaggc aagtgttggg tgattgagag 86040 

ctccacgggc catggttagg ttttattcct aatgccgaga tgccaaacat ggtggttcat 86100 

atctgtaatc ccagtatttt aggaggccga ggcaggaata tagcttgaac ccaggagttc 86160 

aagaccagcc tgagcaacat gagacctgta caaaacattt aaaaaattgc tgggtatgat 86220 

ggtgcacacc tgtggtccca gctactcagg aggctgaggc agaaggatca cttgagccta 86280

ggaggtggag gctacaatga gccatatttg agtcactaca ctccagcctg gatgacaaag 86340

tgagaccatg tgtcaaacaa aatacagaaa gaatattaat ttaaaatttt gaaagaggag 86400 

tgatctgaac ttatatctta aaaagatcat tctagggcat ggtggctcat gcctgtaatc 86460 

aagygctttg ggaggctgag acaggaggat cacctgaggc cagttcgaga tcaacctgta 86520 

cagcatagag agactccatc tctacaaaaa gaaaaaataa atagctgggt gttgtgagtt 86580 

attcaggagg ctgaagcaga aagatcactt gagcccagga gtttgaggct gcagtaagct 86640 

acgatcccac cactgcaaca cagtgagatc ttgtctcaaa aaaaaaaaaa aatcattcta 86700 

gg-gcttttt ggaggctgga tgtggtaaga gtagaagctg gagatggtcc tgttagggat 86760 

tcgattcaga ctttaaatac catcaatgca ttgagtccca aatttacatc actacgttgg 86820 

atccttgccc ctgaatccag actggtatat ccaactttag gttcagtttg tatctctacc 86880 

tgaccaatiat agaggtgtcc agtcttttgg cttccctagg ccacattgga agaagaattg 86940 

tcttgagcca cacatagagt acactaacgc taacaatagc agatgagcta aaaaaaaatc 87000 

gcaaaactta taatgtttta agaaagttta cgaatttgtg ttgggcacat tcagagccat 87060 

cctgggccgc gggatggaca agcttaatcc agtagatacc ttcaacttac aatatctaaa 87120 

attttatgcc agatttagtc attttaaacc tgctcatcag tttttctcaa gaagtagtat 87180 

tttggctttt tttcttttct tttttttgag atggagtttc gctcttatcg ttcaagctgg 87240 

agtgcagtgg cggatcttgg ctcactgcaa cctccgcctc ctgggttcaa gtgattctcc 87300

tgcctcagcc tcgcaagtag ctggaattac aggcatgcgc caccatgacc agctaatttt 87360 

tggagacagg gtttcaccat gttggtcagg ctggttttgt actcctgacc tcaggtgatc 87420 

tgcctgcctc ggcctcccaa aggctgggat tacaggcatg agccaccgct cccggctgca 87480 

tttctggatt tttagttgct cagcccaaaa ctttagtaca tctttgaacc tcttctttcc 87540 

tcctactcta tatctgatcc atcagcaaat ctgttaggtc tacctcacac atatcgaaat 87600 

cctaccacgt ctcaccatct gtgacaatta acaccctggt ctaggcagtc atctctgtta 87660 

agattgagtg gttaaggatg tcctctaagg agatgacatt caaatcttag cttaaatgtc 87720 

aagagggagc tggttttata aagattgagg aggcagcatt attttgccat aggcttccat 87780 

ttggtttcca ttccattctt gatacttatg gtatatattc aaaacaaatg cacagaaaca 87840 

gacccaggta tattgggaat ttcggatata gagttcctag ttgggaaaag atagactgat 87900 

ctgtaaatga tgctagttat ccatcatctg gcaaaaaata atttcctgcc tcctctcata 87960 

tatctcagat caacagactt tttctgttaa gggccaaatc ataaatattt taggctttcc 88020 

agaccatatg gtttctgtca cactctcctt tatccttgaa gccatagaca atatgtaaac 88080 

aaatgggcat ggctgtgcta cgataaaact ttacttacaa aaactggtag tgggccagtt 88140 

taggcatggc cagcactttg ggaggctaag gcagatggat cacttggggt caggagtttg 88200 

agaccagcct ggccaacatg gtgaaaccct gtctctacta aaaatacaaa aaatagctgg 88260 

gcatggtggt gggtgtctat aattccagct actctggagg ctaagacaca agaatcactt 88320 

gaacccagga ggcagaggtt gcagtgagct gagatagcac cactgcactc cagccagggt 88380 

gacggagtct taaagcaaaa caaaacaaaa ggtagtgggt tgtatttggc ccatgggctg 88440 

tagtttgcca atccctgatg cagaaacaaa ttccaggtaa ataagagcct ggaatgttaa 88500 

aaaaacaaaa cttgaagtca tgtagaagaa caggtagggg gaacaatcct gatctcagga 88560 

taggaaggga tattgcttaa aataagacac aggaaaatat aatccatgtt gtgtaaattt 88620 

gactacgtta aaacttaaaa ctttcgccaa gcgcggtggc tcacgcctgt aataccagta 88680

ctttgggagg ccgaggtgag cagatcacca ggtcaggaga ttgagaccat cctggctaac 88740

acggtgaaac cccgtctcta ctaaaaatac aaaacattag ccgggcgtgg tggcgggcgc 88800

ctgtagtccc agctacttgg gaggctgagg caggagaatg gcctgaaccc gggaggcgaa 88860

gcttgcagtg agctgagatc gcgccactgc actccagcct gggcgacaga gtgagattcc 88920

gtctcaaaaa aacaaaacaa aacaaagcaa aaaacctaaa actttcatac aataaagtat 88980

acctaagata cttctagaag agaagattta catccaggac gtgtatggaa tttctgcaag 89040 

taataagtaa aagacaaggg acatgaagag gcagttcaca aaagaggaag ccaaaatgac 89100 

caataaacat gaaaggatgt ttaacctcaa aggaaacaag gaaatgaatt aaaaacatca 89160 

aatgccattt caaaactagt aagttggcaa aattaaaaat accaaggatg agaatatgaa 89220 

gcatggctat atgagtgcat ggaatggtac agtcactttc attaaaaatg cacataattt 89280 

gttttttatt tatttttttg agacagtcta tgtcgcccag gctagaatgc agtggcatga 89340

tctcggctca ccacaatctc tgcctcctgg gttcaagcaa ttctcctgcc tcagcctcct 89400 

gagtagctgg gattacaggc acatgccaca acgcccggtt aagttttgta tttttagtag 89460 

agacagggtt ttgccatgtt ggccaggctg gtctcgaact cctgacctca ggtgagctgc 89520 

ttcccaaagt gctgggatta gaggcgtgag ccaatgctcc tggctgaaaa aaatgcacat 89580

aatttgttac ctagcaattc catgtctaga ggcttatcct agagaaattc ttgcttatat 89640

gcataggaag acgtgtacta gaatgttcac tagttgaatg tttaagtgaa aattaggaaa 89700 

taaagtaaat gttcattaac aggaaaatga gtaaaggtat atttataaaa caattaagta 89760 

gctaaaatga ataaactaga gctgcgtgaa tgaactagaa ctggttcaat agtcatgtca 89820

gattattgaa tgaatacagg tcagatatgt atagagtgtc atttgtgtaa ttaatttttt 89880 

tttttttttt gagatggagt ctcactctgt tgcccaggct ggagtgcagt ggcgtgatct 89940 

cagctcactg caacctccac ctcctgggtt aaagtgattc tcctgcctca gcctcccgag 90000 

tagttgggat tacaggcatg caccaccatg cccagctcat tttcctattt ttagtggcca 90060 

cagggtttca ccatgttggc caggctggtc ttgaactcct gacctcaagt gttccaccca 90120 

acttggcctc ccaaagtgct aggattacag gcgtgagcca ccgtgctcag ccatttgcgt 90180

gatttttaaa gatgtgcaga ataatgccat taaaaaaaat acacatacat gtatatatat 90240

acacgtttgg ctgggtgtgg tggctcacac ctgtaatccc agcactttgg gaggctgagg 90300

caggaggatc acttgagccc aggtgtacaa gactagcctg ggcgagatag caagacccca 90360

tctcaacaac agaaaggata attaggtatg gtggcatgag aggatcactt gagcccagga 90420

gttcgagtgt tatcaggcca ctgcactcta gcctggacaa caaagcaaga ccgtgtctca 90480

aaaaaataaa aataaaaagt atttgtatgt ggtcatagtc aaaaaacgta catggaagga 90540

aaatgtcttt atttatttat ttattttttt ttttttaaga cagagtcttg ctctgtcacc 90600 

caggctgggg tacagtggtg taatctcagc tcaccgcaat ctcggcctcc cgggttcaag 90660 

cgattcttct gcctcagcct tctaagtagc tgggactaca ggtacccgcc accacaccct 90720 

gctaattctt gtgttttcag tagagacagg gtttcaccat gttggcaagg ctggtctcga 90780 

actcctgacc ttaagtgagc cacccgcctt ggcctcccaa agtcctggga ttacaggtgt 90840 

gagccactgc gcttggccag gaaatatcta atttagtaag tatttatatc tgggaaagga 90900 

agggtcaggt ggtgattcat aggaactcta aagtctatgt ataatactta gggggacaga 90960

aggaaataaa gcaaaatgct gatatttgat tgttgagttg tgtatatgtt agaagtataa 91020

cataggagat ctgattgata gtaggagaat gtttttaggt ggtaaaagtg gaaccgtggt 91080 

ggtttgtttt ggcagtagaa tcagttggtc atagtttgta tgtggaaggt aataaacaga 91140 

ccatgttaag gatgacttcc ggaattttgg tctgagtagt gggtggatga cagtgtcatt 91200 

catgagggaa gatgaagact gaggtaggaa caggtttggg agaagatgac atgttccctt 91260

ttagacaagt ggaattatgg aagatggcag gtaggtggtt agctatatga atttgagata 91320 

aaagatttag gatggagata taaatttagg agtaacagcg tatctatggt attgtaagcc 91380

ttaagaatgg gtaggatcag ccaggaaata cagatgtata tgcagaagag aggagtcaag 91440 

gaagccaaga caagttaatg tttaaagtga gtgatgtagt ccatgggcag atgctgctga 91500

gagggctgca aacaccagtg accctacaac atttttaaat gtcgtcttcc tgacagcagt 91560

gatcagtacc tgcaacgatc ttatttattt ttttcatgtt agtctccaca cacttgaatg 91620

tagacttttt gaaggcaaaa tcattgcctt ttctgagctg ggagcatgtc tggcacatac 91680

caagcactca acagttgatg tattgacttc atccagatac tctgagggcg agttatttcc 91740

tgctactagc ctttcacctt tcaatgttta agagcacaaa tacagagatg ggcacgtttt 91800

ggcatttctt attttgataa ccttttcctg gtaagatttt ttaatgttga aaaaaaaaaa 91860 

caagaaaaga gggttaaaaa tagtcttatg tcagatcctg tgatagaatt cacacttggc 91920 

ttaagctgct gggcaccttc ctatcttgga tgtcatatta gcttatctac agcagaattt 91980

ttactgtttt atgtagtaag gaagcaatta tatgattatt ttacagacaa attattcttt 92040 

atcttttatt tttttagacg gagtctctct ttgtctccca ggctggagta cagtgtcgcg 92100 

atctcggctc actgcaacct ccgcctcctg ggttcaagca attctctgcc tcagcctccc 92160 

aagtagctgg gcttacaggt gtccgccacc acacccagct cattgttttg tatttttagt 92220 

agagatgggg tttcaccatg ttggccaggc tggtcttgag ctactgacct caggtgatcc 92280 
acccgccttg gcatcccaaa gtgctggaat tacaggcgtg agccaccgtg cctggcccag 92340

acaaactatt atactctgag tgttagaggc ttaggatgtt ttcacttgat gctatgggag 92400 

gaataagtaa taagatatga tacacaacca aagacctttc ttcactatgc ttctagtagc 92460 

tagtactatg gatgacacat ggtaataata ttggttagca tttgtcctca atttactgtg 92520 

ctagttaccc ttctaagccc cttacaggta tatatttttt ttcatcaata atcctctaag 92580 

gtagttttta ttattgacct aattttataa atcaagaaaa ttaagaccca gagaagtaag 92640 

taactcgtcc aagatcacat ggcttataag tggtagagcc agaatttgac cccagatgtt 92700 

gtgactacat tgtctctcca taagcaggtt caactctttt gactggatgc tgttccaagg 92760 

tcacttcctt agagaagcct ttgctgacaa ctaccctcct gtgccctcct ccaaggctgt 92820 

ccattgttct agaactttga atactcatct tagaataaag ctggtctaat ttttacagtg 92880 

ttacagaatg gatctctgac tgcaaaagtt ggtcataatt atctttttat gttctagtga 92940 

aaggcaaaga acaagagaag acctcagatg tgaagtccat taaaggtaag ttctgccctt 93000

ggcagtccac tgcattaaaa agtgatgtgc tttgcatttg tgagttcttt aatcctgtta 93060 

tactqrctct tttggcatta atcatttctg ccttatttta taattactta tgattttgat 93120 

ttact:ccct ctttaacctg tataatgctt taacatctag catataataa gtaggctttt 93180 

tttttntttt tttttttgga gacggagtct tgctctgtta cccaggctgg agtgcagtgg 93240 

cgcgatcttg gctcactgca agctctgtct cccgggttca caccattctc ctgcctcagc 93300 

ctccccagca gctgggacta caggtgcacg gcgccacgcc tggctaattt tttgtatttt 93360 

ttagtagaga cagagtttca ccatgttagc cagtatggtc tcgatctcct gaccttgtga 93420 

tccgcccgcc ccggcctccc aaagtgctgg gattacaagc gtgagccacc gcacccggcc 93480 

gtaagtaggc trtttttacc ttaattttat ttttttgaga tggagtcttg ctcttatccc 93540 

caggccggag tgcagtggtg ccatctcggc tcactgcagc atccacctcc cgggttcaag 93600 

cgattctcct gcctcagcct cccgagtagc tgggattaca ggtggccgcc accatgccca 93660 

gctiaatcttt gtatttttag tagagacagg gtttcaccgt gttggccagg ccagtctcaa 93720 

actcctgacc tcaagtgatc cactcgcctt ggcctcccaa agtcctggga ttacaggcgt 93780 

gagccaccat gcctggccat aagtaggctt ttactgagcc ttgtgtgtat tggctatcct 93840

agtgatcaca gtgaaccagt gcccttctta ttaatcacac atttaattgt tccctaaaag 93900

tgattagtitc actttattta tttagtaaga caaaaaatga agaatactct taactgagca 93960 

gtctgttaac tgtaggaaag cactgacact tataaggctt agttttctgt catttatcca 94020

gaagtatggt tgattacagt ttttactttt ttatttgaat gaacaacctt aatttaaaat 94080 

atattttgtt tattttttgt tgggatcgat acattgtcct tgtttataga ttagagcatg 94140 

ctttttaaag atgctgtatt actcactgat tttatttgtc cagtgtacag agattgaagt 94200 

gggaaaatta taatggaaat tgtttccata gtcattacat attaatttca tcaatttatt 94260 

tccataaaat ctgtagattg ctacttattt agatttttcc ttcaaatgtt tttatgttgt 94320 

attgcttgca ctgagtattt attctatatg ctcaatttgc tggagaagaa gactaattat 94380 

aacttaggca agttgtaaaa ttagggaaaa aagtaaggta ccttacagcc tagtttactt 94440 

atttcttatg taaagccagt tagattccac attagttcaa actgccttct ttgagcaaaa 94500 

cttgattggc agtgataaag gcttaaagcc cttctcaagc agagacctgt aaagactaga 94560 

tctgactgta gtagaaggaa ggaacttaga tgtttcaggc agtgagaaca ccagtcttcc 94620 

actctaaac: ttgccactaa cagtatgacc ttgggaagtt gtaactttct tcagattctt 94680 

catttgttga atggggggat tggcctagct aatttctaaa tctctactgg gctaaaaaat 94740 

tctgtgctta tactctgatt atgaagtaca taatctgtgc ttaacattca ctgacttatc 94800 

cttaggataa tacagaagca gtacaagaaa cagcccctca agatgtttgc agtctggtta 94860 

gaaagacaaa cttatacaca gaacagtagc aaatagacca aaataataat agctgccatt 94920 

tatagaacac ttcttctgtt ctgggcatta gacaaaaact gactataacg gtgaacaaaa 94980 

aagacttagg tcctgccctc attgaactta cagattagta ggggagagga acattaatca 95040 

agtaattcca cagatggctt agcctagatt ggtagtgatg gaagtaaaga gatgtgaacg 95100 

gacttgaaaa aaaattcgga ggcaaaatgg atagaagttt attattgatt aaatatgagg 95160 

tgtgagagag agggatattt aagattgata cctaccttct ggcttgccta acagaaccaa 95220 

aacaggaaat tatatgttca gttttgttat gttgggtggg aggtgctttt gagtcattca 95280 

tttatatatg ttatatatgt tattttatat gcatagtaat tttaaggtct gagttttaaa 95340 

ccaaaggtta gagagtgatt ttttagagtc tagcaaacct aagttgaaat cctgcctgtt 95400

gaaatggctg tttactagct cattaaccta gggcaaagta ttcaacttgt tttcattttt 95460 

gtcttcatct ctaaaatgag gaaaatatgg tcttacaaga ttgtcctgag agatagatga 95520 

aataatatcc aaaaaaaaaa aaggtacata gagaaactcg tatagtgcct ggtatatagt 95580 

aggtcctcca ttggtagcta tcattatcta gttttaacat agccttcagt ttgttgaatt 95640 

agtcaaactg agtgaagcac tgcaaggaat tcagaggaat ttgagatcaa caaatgattt 95700 

ctgaagttta gggaagactt catggcaatg acacttacct tgtataaaag ttgaagaata 95760 

agaaagattt gaatgagaga ttctttctct tctccctacc agcccagctt cttatttgag 95820 

gatatattgg gcaaaggggc cttcagacaa gtagagggag atttttacag aaagattgag 95880 
atgaaggtat agaaggctgt aaagaccaga aaagagaatt gagacagagg aagcaggaag 95940

ccactgtagg tttttgagca agatattgat gctgtaagta tggtgtttat gaaaggttag 96000 

tctggaagag atttgcagga tggagacccc ggaagttttt ttgttataat acagaaagac 96060

ttgcactgag ggtgaggtgt taaaaataaa caggtaagta aatgtttaaa catcttgaag 96120 

gaaaagtcaa caaatcttgg caagtaaaca gataacagtg aaaaagaatg ggaccaagat 96180 

tttgagtttt ggagactggt ggattgaaca gacagggaaa ttgagaggag aatcagatga 96240 

tgatgttcta agttgatatt tagacagatt gtgcttgaga tggtaaagtc aatgtgggtg 96300 

ggaatgctta gtagcgagta atcagtgata caagaccaaa gcccaggtca aagacaagtc 96360 

acagatacag atcagggctt tttcatctgc tccacagagg tgtaccctag gagctgttgc 96420 

aaacagtcca tgtggagggt gtgagtaaga tgtttccctt gaatttgcca gaattacttt 96480 

ctcgttgttg ttgttgtttt ttctgagaca gattctcgct ctgttgccca ggctggaggg 96540 

cagtggcgag atcgcgcagc tcactgcaac ctctgcctct cgggttcgag tgattctcct 96600 

gcctcagcct cccaagtagc tgggattaca ggcttgtgcc accaagccca gctaatttct 96660 

tttgtatttt tagtagagat ggggtttcac catgttggcc agactggtct cgaactcctg 96720

gcctcgtgat ctgcctgcct cagcctccaa aagttctggg attacaggcg tgaaccactg 96780 

cacccggtcc cttgttaagt ttattttggt gggaagcaaa ggaggtttca gcttttaaaa 96840

agtrtgaaaa ttattgctct ggtaataatt aaagatttga gagtaaatat gctttctagc 96900 

agaaagaata aaagaagaac agatagcctc aagaagggga gccaaagaag caggctatat 96960

ctgacacact gggtgttgat aaatgggtat taaaagaatg agagcaatga gcagatagaa 97020 

gaggaaatta ggagagtata ataccatgga gaccaagaaa gatagactat caggaaggag 97080

tggtaaaaat aagttactag ttctaagaga gatgttaaga gggaccgggg aaagccttgt 97140 

acaaatgagt tagtagcatt ttacattata tacatctaat taagaaacaa tgcgagagtc 97200 

tcaccattcc tatagactct tacttgtact tgtctgaaca cgaaaactgg cttttgttta 97260 

taaataagct aaaaattatt ttgctccaat ttctcatgaa aataaaaata aaccttcttt 97320 

taacattgaa aaaatagttt gaagacagtc actcttcatt ttgtaattcc cacaactatt 97380 

attgaatgac tgaaattatc tttattctga agccaaaggg gtgatactga tatttcttca 97440 

gactactaaa aatatatttt atgaattttt agtgtgcttt atcttttttt gttttttttt 97500 

ttgagatgga gtttcactcc cgttgctcag gctggagggc agtggtgcaa tctcagctca 97560

ctgcaacctt cgcctcccag attcaagcaa ttctcctgcc tcggtctccc aagtagctgg 97620

gattacaggc acctgccccc acacccagct aattttttgt atttttagta gagacagggt 97680

ttcaccatgt tggtcaggct ggtcttgaac tcctgacctc aggtgatcca cccaccttgg 97740 

cctcccaaag tactgcgatt gcaggcatga gccaccatgc ctggcctgag gaatattttt 97800

ctaggttccc cccaccccaa gcatttattc tgcaatttta gttttgttcc taaagcaagc 97860

aaggtttaag gatttaaaaa taatccgtat tttagaatgc tttctggctt tgttactttt 97920

tatccacagt agaagttctc agagaatgat ctccctcttt taatttaact ttttggcaca 97980

gtatttcgag aattataaat aatattagaa tgttttctgg ctgggtgtgg tggctcatgc 98040 

ctgtaatcct ggctacttgg gaggctgagg caggagaatc acttgaacat gggaggcaga 98100 

ggttgcagtg agccgaggtc atgccactgc actccagcct gggtgacaga gcaagactct 98160 

gtctgggaaa aaaaaaaaaa aaaaaaagag tgttttcttt cctattttcc accacttgat 98220 

taagttactt ttcctcttaa gtattzttttg ctgagtatgc tgacttaaga gtaatgttac 98280 

aaaatttaat ttttaaagtt ctctgaaagc ccctttatga gagttttagg ctatcaaatt 98340 

gtgtttaatt cttaacaatt ttttgaaaaa ttatagcttc aatatccgta cattccccac 98400 

aaaaaagcac taaaaatcat gccttgctgg aggctgcagg accaagtcat gttgcaatca 98460 

atgccatttc tgccaacatg gactcctttt caagtagcag gacagccaca cttaagaagc 98520 

agccaagcca catggaggcc gctcattttg gtgacctggg taagtaacta tcatttttta 98580 

ttaacttgta ttagaaggat ttgagtacaa tatgtgaaac ttctgtcata ggatacagaa 98640 

ctatataatt ggaaagtgct ttggaaaaaa tgtatttaaa ataacagcta caagtataat 98700 

gggtagctgt gttgtgttcc tgtaaatata gaatataaag catgcccagt agaaaaacaa 98760 

gcatttccag aagaaatata tctgatcact aaatataaat atatgaaaaa gatgtctcac 98820 

tttattactg agggaagtgc aaattaaaat aatcagttaa tgttctccta acacattagc 98880 

atatttttta aagtttgaca atttgaatgt cagtgaagat gcagggaaat acccctccta 98940 

tttagtgata atataatctg gtgaagactc tttggaaagc aatttggaaa tcagtataaa 99000 

atatgcatgt catttaggcc actctttcta agacctagcc ctcagatatg ctcattcata 99060 

tgtgcaggtg tgtatgtgtg tgtgtgtgtg tgtgtgtgtg tgtatatgta tgtatgtatg 99120 

tatgtatgta tgtatgttga aggctattca ttatagtatt gtttgtgata gcaaaaaatt 99180 

atggacaaca tataaatatc tgttataggg aaataaccaa attgtggtat acgcatgctc 99240 

tggagtataa tatagccatt tgtttctatt tatttatttt cttgagacag ggttttactc 99300 

tgttgcccag gctggagtgc agtggtatga tcatggttca ctgcagcctt cacctcctgg 99360 

gcacaagcca ttctctcgcc tcagcctcca gagttactag gactgcaggc atgtgtcacc 99420 

acacccagat aattttttaa ttttttgtag agacagggtc tcactatgtt gcctaagctg 99480 
gtctcaaact cctggcctca agcaattctc ccacacaggc ctcccaaagt gctgggatta 99540

ccaacgtgaa ccaccacacc tggttcagtg tagccattta gaaatctaaa aaagacgtgg 99600

gaaaatgtct aaggcatgtt taaatgtgag aaaagcaagt cacagtatgc atggtaaaat 99660

ccgttatatt aaaataagtt cttccaaaac aaaaacatat gcaggagacc tttattttgt 99720

cagtatttct tacccaaatt tctgcactta gaaaattgca tgtcatgttg tcataagttg 99780

aaaaaaagat ccatgaacca atggacttct aataaaatca gtcctgcttt tgacatctct 99840

ctctactttt gtgtatattc aaaccagagt gtcaatgtgt ttgtggggca cacttagcaa 99900

taatacatag cagacaaaat gcatatagct cagagagtaa aattgtaagt tttgctagat 99960

cactcataaa ttgctgatga gaatttaaaa tggtgcagat gctctggaaa acaggcagtt 100020

tctttctttc tttttttttt tctttttgag acagggtctc actctgttgc gcaggctgga 100080

gtacagtggc gtgattacaa ctcactgcag cctcaccctc ctcaggttca ggtgatcctc 100140

cctcagtctc ctgagtagct gggactatag gcatgcacca ccacgcctgg ctaatttttg 100200

tatttttttt tttttttttt gtagagacgg ggtttcgcca tgtttcccag gctggtctca 100260

aactcctgga atcaagcgat ccacttgcgt aggcctccca aagtgctggg attacgggcg 100320

tgagctactg tgcctggcct aggcagtttg tttgtttgtt tgtttgtttg tttatttatt 100380

tgtagacgga gtctcacagg ctggagtgca gtggcccaat ttttggctca ctgcaacctc 100440

cgcctcccag gttcaagcta ttctcctgcc tcagcctcct gagtagctgg gatgacaggt 100500

gcctgccata atgcctggct gatttttgta tatttagtag atatggggtt tcaccatgtt 100560

ggtcaggctg gttttgaact cctgacctca ggtgatcagc ccgcctcggc ctcccaaagt 100620

gctgggatta caggcatgag ccgtcatccc tggctggtgg tttcttatga cgtgaaacat 100680

gcaattacca tatgacctag cagttgcact ctgtatttat cccagataaa tgaaaactta 100740

ccttccaata aaaacctgtg cacaaatgtt catagcagct taatattgaa aaactggatg 100800

ttcttcagca ggtgaatgaa ctggttcatt cataccatgg aataccattc agcaataaaa 100860

aggaacaaac tgttgataca tttaaccacc tggatgaata tcaagggaat tatgctgtca 100920

gacaaaaacc agtccctaaa gactacatat agtatgattc cgtttggata atattcttga 100980

aatagagaaa ttaagagaaa tgaaaagatt agtgtttgcc agatgttaga gacagggagg 101040

tgagaggggt aagtgggtgt agttataaaa gtgcaacatg agggatcttt gtgatgttga 101100

agttgtatct tggcagtgga tgcagaaatc tcaatgtgat aaaattaccia agaactaaaa 101160

acaagaatga gtatagataa aactggggaa atctgaacaa gttagagtgt tgtatcactg 101220

tcagtatctt agagtgatat tgtactatag ctttgcaaga tgttaccatg ggagaaacta 101280

aagtgtacaa gggatctcta ggtattatta tttttttaga gatggggttt cactatgttc 101340

cccaggccgg tcttgaactc ctgggctcta gtgatccgcc tgccccagcc tcctaaagta 101400

ctggaattac aggcgtgagc gaccatgcct ggccctttca gtattgtatc ttagaacttc 101460

atgtgaatct agcattatct catagaattt aattaaaaga aattgtaaac ctcacagaag 101520

atcagaattt cctcaagttt gtgatgttga caaagatgaa ctagttgaca ctgacagtaa 101580

gactgaggat gaagacacga cgtgcttcaa aaaaatgatt tgaatatcaa tggattaaga 101640

agaactcttt tgacaaattg atgaaaccct cagtcagttt tataagaatg cccatcttta 101700

tgatcatgct atgaaagcca atttttaaaa aaattttttg tctttcctaa caattagctt 101760

gtggttataa tttaaattta gttaaatata agataaatga ttttttatta agtttagttt 101820

catttttcaa ggtacgatct caaagctact ctttaaccta ctatgaatga ataatgctga 101880

gttcataaca tctttgtaga tatatccaca attttccctc aggataagtg cctacaagtg 101940

gaattactgg actgaaaata atgcagtttg ctaagacttt gctatctgtt cctgaatgct 102000

cctccaaaaa ggttttgcca gtttacatcc tcatgaccag cgaatgagag tgttgcctat 102060

tttcctgtgc ccttgttact gcttaataat ttttgaaaaa aatctaattt gacagacaaa 102120

aatgcatttt atgttaattt gcttttctgg gatttttaat gaggttgagt atagttttta 102180

atatttttat tggccccttt ggaactagta tcataagttt tttttcttaa gaatttatgt 102240

agtctgggct gggcgcagtg gctcacgcct gcaatcccag cactttggga ggccgaggtg 102300

ggtggattgc cgaaggtcag gagtttgaga ccatcctgac caacatggtg aaaccgaatc 102360

tctactaaaa gtacaaaaac tagctcagcg tggtggcggg tgcctgtaat cccagctact 102420

taggaggctg agtcaagaga atcgcttgaa cccgggaggt ggaggttggt tgcattgagc 102480

cgagatcgcg ccattgctct ccagcctagg caacaagagt gaaaagtctc aaaaaaaaaa 102540

aaaaaaaaaa aaaaaagaat ttacatggtc tgaattgcca ttaaaagaga tatgagaatt 102600

attgagtaac aaataacttt ttaataattt aggcaagttt tggacgattg tactttgttt 102660

agaaaccaaa agcatagtat ttgtagtttt tttatttact ttagttgcta ggaagtaaac 102720

tttattcaag gtctctggta ccagttgttg ctaaaagtga ttgactaatc tgtcaatctg 102780

aaattatttg ttgctgaact gctaattctt ttgcttctat cttttaggca gatcttgtct 102840

ggactaccag actcaagaga ccaaatcaag cctttctaag acccttgaac aagtcttgca 102900

cgacactatt gtcctccctt acttcattca attcatggaa cttcggcgaa tggagcattt 102960

ggtgaaattt tggttagagg ctgaaagttt tcattcaaca acttggtcgc gaataagagc 103020

acacagtcta aacacagtga agcagagctc actggctgag cctgtctctc catctaaaaa 103080
gcatgaaact acagcgtctt ttttaactga ttctcttgat aagagattgg aggattctgg 103140

ctcagcacag ttgtttatga ctcattcaga aggaattgac ctgaataata gaactaacag 103200

cactcagaat cacttgctgc tttcccagga atgtgacagt gcccattctc tccgtcttga 103260

aatggccaga gcaggaactc accaagtttc catggaaacc caagaatctt cctctacact 103320

tacagtagcc agtagaaata gtcccgcttc tccactaaaa gaattgtcag gaaaactaat 103380

gaaaagtgag tatgtgattt tcttgtgtgt acatatgtgt ctcactttct ttttttaatt 103440

tactaagcag aacttcagat gaggaataaa atgattggaa tatttttttt ctcctctaac 103500

tacttgtaaa tttgggagaa tttggagagt gtagtagagt cagatcagtg tatggaaaag 103560

gagcaggagt gactggacct tctaagaagt gtgttatcag aattagtaaa tgaagggtca 103620

aatgtcctac ttttcccctc cactgatttt gacatcaaac cattatccac atagccttat 103680

ttcctccctc ggtcttaatt ttattaatat tttactgcac tttgcagata aaatttttaa 103740

aaaattttta aaaattgcca ataagtgaca tttattaagt tcagtgctta gtgtatattt 103800

ggattttatt tattagtcac aagacctttg tgcaggtagt aggcatgatt atcttttttt 103860

ttttgagatg gagtcttgct ctgtcgccca ggctggagtg caatggcgcg gtctcggctc 103920

actgcaacct ccgggttcat gccattctcc tgcctcagcc tcccaaatag ctgggactac 103980

aggcgcctgc caccacaccc ggctaatttt tttgtatttt tagtagagac ggggtttcac 104040

catgttcgcc aggatggtct cgatctcctg actttgtgat ccgcctgcct cggcctccca 104100

aagtgctggg attacaggca tgagccaccg cgcccggact gattatctta tttacacatg 104160

agaaaaccag ggcttagaaa ggttaggtaa cttcctctag gttgtacagt aaatgtggac 104220

ctagaagcat tttgacaaga gcacctgttt ttttttcttc tctattagtt tagaaattat 104280

atactcttaa ttatcacctg ggattttgat tagacagcct tcatgttctt tttcatctta 104340

aatgttcttt gtgtcttaaa gggctaagtg atttcttcag atcttttagt tcactcattc 104400

tcagtgaact aaaatgaggt ctaatctgct actgaatcaa gttttcagca tgttatttcc 104460

ttcctccctc cctccctcct tccttccctc aaccaggctc ccgaggagct gggattacag 104520

gcgcccgcca ccactcctgg ctaattttta tattttagta gagacggggt ttcaccatgt 104580

tggtcaggct gatcttgaac tcctgacctc aagtgaccca cctgcctcgg cctcccaaag 104640

tgctgggatt acaggcatga atcaccacac ctgacggcat gttattttca tcgcaaagtt 104700

actgtaagct gggagaagtg gcacacactt gtactcccag ctactcagga agcttaaggt 104760

gagaagattg cttgagccca ggagttttga gaccaacctg ggcaacacag caagacccca 104820

gctcaaacaa agaaaaaaag ttattgaatt ttttatttct atggatcatt ttttgtagtt 104880

tcttattcct ttcacccttc attcccactt ttgatcccat cttttattta tttagtttta 104940

ttaaatgtat atttgtctga taattctgct atctacagtt ttttgtggac ctgactcagc 105000

atttctttgt ttcttcggat tcagactgtt ggtggcttgt gattttagtg atttttggcc 105060

gtgaacatgt ttcttggact tttgtctgtg ggaattctct gtgtactctg tataaattaa 105120

gttacttcag gtgttttgca ttttcttttg ccatgcacct ggggcctggg tcactaccct 105180

tctggtacca cttaaaactg aatttttgtc ttgggtgctc gtactgatcc tgtatgagta 105240

caggtttata cttactgtag aaatatggtg tttgattatg gggtattgtc ccagatggtg 105300

ctggagtatt aatatgctct ctgttaaact taatgtgttg tccctgtaaa actccaaaat 105360

tctgaattcc agaatactac tggccccaaa tgtttaagat aagggcactg cctgtatttg 105420

tttctgcctc ccactatttt ccttagttta acacaaactc acctttttaa aaaacatttt 105480

gagagaattc agtattggga agagtttcta acctgtttct ggaaatggaa gtccaaagtc 105540

tgtttctgta attgtttttt ttttgagatg gagtctcact ctgtcaccca ggctggagtg 105600

caatgacgta ctctcagctc actgcaacct ccacctcccg ggttcaagcg attctcttgc 105660

ctcagccccc tgagtagctg ggattacagg tgcccaccac catgcctggc tgatttttgt 105720

atttttagaa gagatggggt ttcgccatgt tggccaggct ggtcttgaac tcctgacttt 105780

gtgatctgcc cacctcagcc tcccaaagtg ctaggattat gtttctgtaa ttgtaataca 105840

tttattgttt ttagaaactg tctttgcttt agtggtaatt ttcaataaaa atagaaatag 105900

cagtggagtt attaaaagag cattagttac atttttccct ttttcattat cttcaaatat 105960

tatatatagt aagtttgacc tttttaaaat gtatacttgt atcagtttta acacatacat 106020

agattcctgt aactgtcacc actataaggg taaagaacag ttagttcctt cacctttgaa 106080

gtcaagcccc acctctatcc caacacttgg caaccgctga tctttctccg tctcaatagc 106140

tttgcctttt ctcttttttt ttcttatttt tttttttgag acagcgtctt gctctgtcgc 106200

ccgagctgga gtgcagtgag gcaatctcgg ctcactgcaa cctccgcctc ctgggttcaa 106260

gcagttctcc tgccttagcc tccctagtag ctgggattat aggcacgcac caccacaccc 106320

ggctgatttt tttgtatttt tagtagaaat ggggtttcac catgttggcc aggctggtct 106380

caaactcttg acctcaagtg atccacctgc ctcggcctcc caaagtgctg ggattacagg 106440

cgtgagccac tgtgcccaat caggactttt tttttttaaa tttacattca acttgtcatt 106500

tttttcttgt atggattgtg ccttcagagt cacacctaag agccctttgc ctaagcaaag 106560

gtcatgaaga ttttctcata tgtttccttt taaaagtatt gtggttggcc aggtgccatg 106620

gcttatgcct gtaatctcag cactttgaga agctgaggtg ggcagattac gaggtcagga 106680
gatcgagacc

aaaaaaatta 

gcaggagaat 

cactccagcc 

ttttacactt 

ttacgtcaga 

ccatttgttg 

gtctaggcct 

taataccaca 

ataaaacgaa 

gtagaattgg 

ggcctagggt 

gtgcagtggt 

tgcctcagcc 

tgtattttta 

ctcaagtgat 

tgcctggcag 

tttaatagat 

tttatggcct 

tatagcattt 

ttgtctttct 

ttttgatttt 

tttccattat 

taaagtagaa 

ctgtaaattt 

ctgagcacaa 

ttagaaatat 

catttatttc 

ttaattttta 

aaagattgta 

taatgatggt 

ttattactca 

agctctatct 

aggatgatat 

aattttcctt 

ttcatttgat 

ttataattct 

tttatttatt 

aagctggagt 

gattctcctg 

tgattttttg 

ctcctgacct 

agccactgca 

gttggtaaat 

tgttgcactg 

actcataata 

ttctccctct 

tagtatttgt 

acaatagttg 

cagaggaccc 

gtgatgactc 

ccacccttaa 

gggggtaagg 

aaaatggagt 

tttcttttcc 

cgttctcaat 

cctcacttcc 

ggccgggcgg 

gaccccccac 

cctcccagat 



atcctggcta 

gccgggcgtg 

agtgtgaacc 

tgggcaacac 

tacgtttaga 

ttttttgttt 

aaaagacaac 

gtttttggac 

tggtcttaat 

ttgggaagtt 

tgtcatttct 

tttgtttttt 

gagatcttgg 

tcccaaatag 

atgcagacag 

tagcccacct 

gggcctaggg 

ataggactat 

ttgagtaatt 

cgggtttgta 

ttgtcagatt 

tctgttgttt 

ttctgcttgc 

acttagattt 

ccttctaacc 

atgaaatgtt 

gttatttagt 

tcatttcatt 

aaaataacat 

cattctgttt 

gttcagtttt 

gaagagtgtt 

ggttttgctt 

cttctgggtg 

gttctaagat 

tagtgcttgc 

atttaaaggg 

tatttattta 

gcagtggtgc 

cctcagcctc 

tatttttagt 

caggtgatcc 

cccggctgag 

ttaattattt 

gggtatttat 

atattaatat 

ttgatttccc 

tgatcattct 

agggaaggtc 

tgcggccttc 

ttaacgagca 

tccatttaac 

ttatagatta 

ctcccatgtc 

ccacatttcc 

gagctgttgg 

cagatggggc 

aggcgccccc

ctccctcccg 

ggggcggctg 



atgcggtgaa 

gtggcgggca 

cgggaggtgg 

agtgagactc 

tatatatctt 

tttgtttatt 

ctttactcca 

tcctttttct 

tactgtatag 

tttattttta 

tctttagata 

gtgtgtgaga 

cttactgcaa 

ctgggattac 

ggtttcacca 

tggcctccca 

ttttcttttt 

ttaagttatc 

aattgtattg 

gtggtatccc 

gtatagggat 

tgttttcaat 

tttgggttta 

ctggtttgag 

actgctttag 

ctaatttccc 

ttgcaagcaa 

atattatggt 

taaaaaattt 

ttggacagtt 

tctttattct 

gaactttcca 

catgtatttt 

aattgcctgt 

cagaaatatc 

atggcatatc 

ggcttcttgt 

tttatttatt 

aatcctggct 

ccaagtagct 

agaaacggat 

acctgctttg 

tcatgttatt 

taatataaat 

aatgtgtaaa 

ctttggattt 

cttttttgct 

tgggtgtttc 

agcagataaa 

tgcagtgttt 

tgctgccttc 

cctgagtggt 

acagcatccc 

tacttctttc 

cccttttcta 

gtacacctcc 

agccgggcag 

cacctccctc 

gacggggcgg 

gccgggcggg 



accccatctc 

cctgtagtcc 

agcttgcagt 

catctcaaaa 

ttttgagtta 

tttacatatg 

ttgaattgcc 

gtttcatgat 

taagtcttaa 

ctcttatttc 

tttggttgaa 

cagagtctca 

cctctgcctc 

aagcgtgtgc 

tgttagccaa 

aagtgttagg 

cagagtattt 

tgtttcttct 

aattgtcaaa 

tcttttattc 

ttattagtct 

tttattgatt 

ttttactctt 

acctttcttt 

ttacaccccc 

ttgaatctta 

ttggagactt 

cagagaatat 

tttaaaatgt 

ttctataaat 

tgctgatact 

actacaattt 

gaggctctgt 

tttatcatta 

tgttgtccaa 

tttttccatt 

aggcagcata 

tattgagaca 

taccacaacc 

gggattacag 

tttcaccatg 

gcctcccaaa 

tttaatcttt 

tttagtataa 

tataattatt 

agattaccag 

tttttttttt 

ttggagaggg 

catgtgaaca 

gtgtccctgg 

aagcatctgt 

aatagcacat 

aaggcagaag 

tacacagaca 

ttcgacaaaa 

cagacggggt 

aggcgccccc 

ccggatgggg 

ctggccgggc 

ggctgccccc 
cagctacttg

gagccgagat 

aaaaaaaaaa 

atgtcgtata 

gatgtctagt 

tttgtacttt 

gtgtgtgtct 

aattgggtaa 

cattttctag 

ttgggaagtg 

cttctgtcac 

ccaggttcaa 

caccatgccc 

gctggtctcg 

attatagatg 

taaactatga 

tgagtgaatt 

tttatgagcg 

ctggtgttgg 

tttcaaagaa 

ttctgctctt 

ttttttttct 

tctaagataa 
ttcttttacc

ttttcctgtt 

attttgaatg 
tgttaggtgt
tttttacttt

tagttgggta 

gagttttgct 

tccacctcct 

gcacgcgcac 

ttagccaggc 

gtgctgggat 

tctcacaata 

ttatttacat 

ggtattaata 

tttagtatat 

ttttaattct 

ggatttggca 

aggtctctgg 

gtacttgaga 

ttaacaaagc 

gtttcagaga 

aatttttctt 

cagtaacaat 

ctgccatcgt 

ggcagctggg 

cacctcccag 

cggctggccg 

gggggctgac 

cacctccctc 



agaggttgag

cgcgccactg 

agtattatgg 

agtatgaggg 

tgttctaata 

tgccatattt 

attcctttgt 

tgctggcctt 

aagagattgt 

atgccatctg 

ccaggttgga 

gttatcctcc 

gactaatttt 

aacttgtgac 

tgagccaccg 

attcagatta 

tttactgtag 

tgtaattatt 

caattgtgtc 

ctagcttttg 

tattatttct 

ccaagttgct 

gcatttaata 

gtattttgaa 

aatgaattat 

atttttctac 

atttcattta 

catacagtat 

tttagttggt 

ttatatcact 

ttttactttc 

gtacacattc 

tctttatggt 

cactgcagct 

tgatctacct 

gtgttattta 

cttgttgccc 

gggttgcagt 

catgcctggc 

tcgtcttgaa 

tacaggcgtg 

cagggttttt 

taaatgtaac 

taattatatt 

gtttttctgt 

tatttttttt 

gggtcatagg 

ttttcctaga 

ttagggagtg 

acatcttgca 

gcagggggtt 

agtacagaac 

ctgatctctc 